Introduction

Local-first voice AI for the web — on-device LLMs, a voice-agent runtime, and a call UI.

Valora is local-first voice AI for the web. Run language models on-device, drive a voice agent, and ship the call UI, all with no inference server in the loop. The whole chain runs in the browser over WebGPU/WASM: speech-to-text → on-device LLM → speech-out → a reactive call UI.

The pipeline

 VAD ──► STT ──► [TurnDetector] ──► LLM ──► Speaker (TTS + Player)
  │                                            │
Silero    Moonshine / Whisper            Kokoro-82M
              │                                │
        LFM2.5 WebGPU kernels  ◄── on-device, no server

Every stage runs on the device. Models download once from Hugging Face on first load, then inference stays local.

Why local-first?

Cloud voice agents round-trip audio to a server on every turn, which adds latency, a per-token bill, and a privacy boundary you don't control. Valora runs the whole pipeline as WebGPU compute kernels in the tab itself, so there's no API key, no server, and conversations never leave the device.

Three composable packages

@valora-ai/engine supplies the on-device language-model engine and registry, while @valora-ai/ai-sdk exposes it through generateText/streamText. The repo CLI in apps/cli runs the same WebGPU kernels under Bun; browser hooks live at @valora-ai/react/local.

@valora-ai/voice orchestrates those engines into a live voice agent: a single state machine (loading | idle | listening | thinking | speaking), barge-in via a monotonic turn token, sentence-streaming TTS, and a useSyncExternalStore reactive snapshot.

@valora-ai/react renders the call: a familiar room/participant set of hooks and components (VoiceRoom, BarVisualizer, Transcript, control bar) bound to the agent.
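
To make the barge-in idea in the @valora-ai/voice description more concrete, here is a minimal sketch of how a monotonic turn token can cancel stale speech. It is not Valora's implementation: `TurnGate`, `speakReply`, `transition`, and `snapshot` are hypothetical names invented for illustration, and only the five state names come from the text above.

```typescript
// Hypothetical sketch: these names are NOT taken from @valora-ai/voice.
type AgentState = "loading" | "idle" | "listening" | "thinking" | "speaking";

class TurnGate {
  private token = 0; // monotonic: it only ever increases
  private state: AgentState = "idle";

  // A new user turn starts (for example, VAD detected speech).
  // Bumping the token invalidates all work tied to earlier turns.
  beginTurn(): number {
    this.token += 1;
    this.state = "listening";
    return this.token;
  }

  isCurrent(token: number): boolean {
    return token === this.token;
  }

  // Apply a state change only if it comes from the current turn.
  transition(token: number, next: AgentState): boolean {
    if (!this.isCurrent(token)) return false;
    this.state = next;
    return true;
  }

  // Plain-object view of the agent, the kind of value a store could
  // hand to a subscriber.
  snapshot(): { state: AgentState; token: number } {
    return { state: this.state, token: this.token };
  }
}

// Speak a reply sentence by sentence, re-checking the token before each
// sentence so a barge-in stops playback at the next sentence boundary.
async function speakReply(
  gate: TurnGate,
  token: number,
  sentences: string[],
  speak: (sentence: string) => Promise<void>,
): Promise<number> {
  let spoken = 0;
  for (const sentence of sentences) {
    if (!gate.transition(token, "speaking")) return spoken; // superseded
    await speak(sentence);
    spoken += 1;
  }
  gate.transition(token, "idle"); // no-op if a newer turn has started
  return spoken;
}
```

In this sketch nothing is cancelled explicitly: a stale reply simply finds that its token no longer matches and stops, which avoids races between an old reply finishing and a new turn beginning.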

How they fit together

The examples/teams-react app composes all three into a Teams-style call screen running a fully local pipeline, with no mocks and no server.

Requirements

WebGPU — Chrome/Edge, or Safari with WebGPU enabled.
Bun ≥ 1.3.14 for the CLI and package scripts.
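
Since WebGPU is a hard requirement, an app may want to check for it before downloading any models. The sketch below uses the standard `navigator.gpu.requestAdapter()` call; the helper name `hasWebGPU` and the small `NavigatorLike` types are assumptions made here so the snippet compiles without extra type packages. They are not part of any Valora package.

```typescript
// Minimal structural types so the sketch compiles without @webgpu/types.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if the WebGPU API is exposed AND an adapter is
// actually available. Some browsers expose navigator.gpu but still
// resolve requestAdapter() to null on unsupported hardware.
async function hasWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false; // API not exposed at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    return false; // treat any failure as "not available"
  }
}

// In a browser you would call it roughly like this:
//   const ok = await hasWebGPU(navigator as NavigatorLike);
//   if (!ok) { /* show an "enable WebGPU" message instead of loading */ }
```

Passing the navigator object in as a parameter, rather than reading the global directly, keeps the check easy to exercise outside a browser.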

Next: head to Getting Started.
