Introduction
Local-first voice AI for the web — on-device LLMs, a voice-agent runtime, and a call UI.
Valora is local-first voice AI for the web. Run language models on-device, drive a voice agent, and ship the call UI — with no inference server in the loop. Speech-to-text → on-device LLM → speech-out → a reactive call UI, all in the browser over WebGPU/WASM.
The pipeline
VAD ──► STT ──► [TurnDetector] ──► LLM ──► Speaker (TTS + Player)
│ │
Silero Moonshine / Whisper Kokoro-82M
│ │
LFM2.5 WebGPU kernels ◄── on-device, no serverEvery stage runs on the device. Models download once from Hugging Face on first load, then inference stays local.
Why local-first?
Cloud voice agents round-trip audio to a server for every turn — latency, a per-token bill, and a privacy boundary you don't control. Valora runs the whole pipeline as WebGPU compute kernels in the tab itself, so there's no API key, no server, and conversations never leave the device.
Three composable packages
Use one, or stack all three into a fully local in-browser voice assistant.
@valora-ai/voice
Framework-agnostic voice-agent runtime — STT stream, turn-taking, streaming TTS, a reactive store, and pluggable engines. Node + browser.
@valora-ai/ai-sdk
In-process AI SDK provider for local WebGPU language models. Pair it with @valora-ai/engine, @valora-ai/react/local, or the repo CLI.
@valora-ai/react
React UI kit for voice calls — particle orb, transcript rail, controls. A familiar room/participant surface bound to a local agent. React 18+.
How they fit together
@valora-ai/enginesupplies the on-device language-model engine and registry, while@valora-ai/ai-sdkexposes it throughgenerateText/streamText. The repo CLI inapps/cliruns the same WebGPU kernels under Bun; browser hooks live at@valora-ai/react/local.@valora-ai/voiceorchestrates those engines into a live voice agent: a single state machine (loading | idle | listening | thinking | speaking), barge-in via a monotonic turn token, sentence-streaming TTS, and auseSyncExternalStorereactive snapshot.@valora-ai/reactrenders the call: a familiar room/participant set of hooks and components (VoiceRoom,BarVisualizer,Transcript, control bar) bound to the agent.
The examples/teams-react app composes all three into a
Teams-style call screen running a fully local pipeline — no mocks, no server.
Requirements
- WebGPU — Chrome/Edge, or Safari with WebGPU enabled.
- Bun ≥ 1.3.14 for the CLI and package scripts.
Next: head to Getting Started.