Web Voice Testing
Test your Vocobase voice agent directly from a web application using real-time WebRTC audio. This guide covers everything you need to connect your frontend to a voice agent — works with any JavaScript framework or vanilla JS.
How it works
Your web app starts a voice session via the Vocobase API, receives WebRTC credentials, and connects using the open-source Pipecat client SDK. Audio flows in real time — your user speaks, the agent responds.
Your App                     Vocobase API                     WebRTC
   |                               |                             |
   | POST /api/v2/sessions/webrtc  |                             |
   | { agent_id }                  |                             |
   | ----------------------------> |                             |
   |                               | (starts bot + room)         |
   | <---------------------------- |                             |
   | { session_id,                 |                             |
   |   daily_room_url,             |                             |
   |   daily_token }               |                             |
   |                               |                             |
   | client.connect({ url, token })                              |
   | ----------------------------------------------------------> |
   |                               |                             |
   | <========= real-time voice conversation (WebRTC) =========> |
   |                               |                             |
   | client.disconnect()                                         |
   | ----------------------------------------------------------> |
   |                               |                             |
   |                               | (session ends,              |
   |                               |  credits deducted,          |
   |                               |  session.completed          |
   |                               |  webhook fires)             |
Install dependencies
npm install @pipecat-ai/client-js @pipecat-ai/daily-transport
For React projects, also install:
npm install @pipecat-ai/client-react
| Package | Purpose |
| --- | --- |
| @pipecat-ai/client-js | Core client: manages connection, events, audio |
| @pipecat-ai/daily-transport | WebRTC transport layer |
| @pipecat-ai/client-react | React provider + audio component (optional) |
Start a session
Call the Vocobase API to create a voice session. The response contains WebRTC credentials you’ll use to connect.
curl -X POST https://api.vocobase.com/api/v2/sessions/webrtc \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "a1234567-abcd-1234-abcd-123456789012"}'
Response:
{
  "success": true,
  "data": {
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "agent_id": "a1234567-abcd-1234-abcd-123456789012",
    "daily_room_url": "https://voice-session.example.com/abc123",
    "daily_token": "eyJhbGciOi..."
  }
}
| Field | Description |
| --- | --- |
| session_id | Unique session identifier. Echoed in the session.completed webhook. |
| agent_id | The agent handling this session. |
| daily_room_url | WebRTC room URL. Pass this to client.connect(). |
| daily_token | One-time access token for the room. Pass this to client.connect(). |
You can also pass variables in the request body to substitute pre-call values into the agent’s prompt and greeting (see Pre-call Variables).
Error responses
| Status | Cause | Action |
| --- | --- | --- |
| 400 | Missing or invalid agent_id | Check the request body |
| 401 | Invalid API key | Verify key format (rg_live_*) |
| 403 | Account not active or no V2 access | Contact Vocobase support |
| 404 | Agent not found | Verify the agent UUID and that it’s active |
| 429 | Rate or concurrency limit | Wait and retry |
| 502 | Voice session temporarily unavailable | Retry after a few seconds |
Never expose your API key in client-side code. Proxy the session start call through your own backend — your backend adds the API key, gets the credentials, and returns daily_room_url + daily_token to the browser. The WebRTC connection itself requires no API key.
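The error table above marks only 429 and 502 as worth retrying. The following is a minimal sketch of how a backend proxy might act on that, not a prescribed implementation. The helper names (isRetryable, retryDelayMs, startSessionWithRetry), the backoff values, and the assumption that Retry-After carries a number of seconds are all illustrative; startSession stands for whatever function in your backend performs the POST and returns a fetch-style Response.

```javascript
// Statuses the error table lists as retryable: 429 (rate or concurrency
// limit) and 502 (voice session temporarily unavailable).
const RETRYABLE_STATUSES = new Set([429, 502]);

function isRetryable(status) {
  return RETRYABLE_STATUSES.has(status);
}

// Delay before the next attempt, in milliseconds.
// Assumption: Retry-After, when present, is a number of seconds.
function retryDelayMs(attempt, retryAfterHeader) {
  if (retryAfterHeader != null && String(retryAfterHeader).trim() !== "") {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Fallback: exponential backoff from 1 s, capped at 8 s (arbitrary values).
  return Math.min(1000 * 2 ** attempt, 8000);
}

// startSession is a hypothetical async function that performs the POST
// and resolves to a fetch-style Response. sleep is injectable for testing.
async function startSessionWithRetry(
  startSession,
  maxAttempts = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  for (let attempt = 0; ; attempt++) {
    const response = await startSession();
    const isLastAttempt = attempt >= maxAttempts - 1;
    if (response.ok || !isRetryable(response.status) || isLastAttempt) {
      return response; // success, non-retryable error, or out of attempts
    }
    await sleep(retryDelayMs(attempt, response.headers.get("Retry-After")));
  }
}
```

Non-retryable statuses (400, 401, 403, 404) are returned immediately so the caller can surface them instead of repeating a request that will fail the same way.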
Connect with vanilla JavaScript
This approach works with any framework — plain JS, Vue, Angular, Svelte, or no framework at all.
Basic connection
import { PipecatClient, RTVIEvent } from "@pipecat-ai/client-js";
import { DailyTransport } from "@pipecat-ai/daily-transport";

// Handle bot audio playback
function handleBotAudio(track, participant) {
  if (participant.local || track.kind !== "audio") return;
  const audio = document.createElement("audio");
  audio.srcObject = new MediaStream([track]);
  audio.play();
}

// Create client (once per page)
const client = new PipecatClient({
  transport: new DailyTransport(),
  enableMic: true,
  enableCam: false,
  callbacks: {
    onTrackStarted: handleBotAudio,
    onBotReady: () => console.log("Agent is ready"),
  },
});

// Start session via your backend (which calls Vocobase API)
const res = await fetch("/api/voice-session", { method: "POST" });
const session = await res.json();

// Connect to the voice agent
await client.connect({
  url: session.daily_room_url,
  token: session.daily_token,
});
Listen to events
// User's speech (transcribed in real time)
client.on(RTVIEvent.UserTranscript, (data) => {
  if (!data.final) return; // Only use final transcripts
  console.log("User:", data.text);
});

// Agent's response (arrives in TTS chunks; see aggregation below)
client.on(RTVIEvent.BotTtsText, (data) => {
  console.log("Agent:", data.text);
});

// Connection lifecycle
client.on(RTVIEvent.Connected, () => console.log("Connected"));
client.on(RTVIEvent.Disconnected, () => console.log("Disconnected"));
client.on(RTVIEvent.BotReady, () => console.log("Agent ready to talk"));

// Errors
client.on(RTVIEvent.Error, (message) => {
  console.error("Error:", message?.data?.message || message);
});
End the call
await client.disconnect();
Microphone controls
// Mute
client.enableMic(false);

// Unmute
client.enableMic(true);
Aggregating bot responses
RTVIEvent.BotTtsText fires once per TTS chunk, not once per complete response. To build full agent messages, aggregate chunks until the next user turn:
let currentBotMessage = "";

client.on(RTVIEvent.BotTtsText, (data) => {
  currentBotMessage += (currentBotMessage ? " " : "") + data.text;
  updateLastAgentMessage(currentBotMessage);
});

client.on(RTVIEvent.UserTranscript, (data) => {
  if (!data.final) return;
  currentBotMessage = ""; // Reset for next agent turn
  addUserMessage(data.text);
});
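The snippet above relies on updateLastAgentMessage and addUserMessage, which are left to your UI. As one possible way to keep that logic independent of the DOM, here is a minimal sketch of a turn aggregator that stores messages in an array. The name createTranscriptAggregator and its method names are hypothetical, not part of the Pipecat SDK.

```javascript
// Collects TTS chunks into one agent message per turn.
// A final user transcript closes the current agent turn.
function createTranscriptAggregator() {
  const messages = []; // entries: { role: "user" | "agent", text: string }
  let agentTurnOpen = false;

  return {
    // Call from the RTVIEvent.BotTtsText handler with data.text.
    addBotChunk(text) {
      if (agentTurnOpen) {
        messages[messages.length - 1].text += " " + text;
      } else {
        messages.push({ role: "agent", text });
        agentTurnOpen = true;
      }
    },

    // Call from the RTVIEvent.UserTranscript handler with data.text, data.final.
    addUserTranscript(text, isFinal) {
      if (!isFinal) return; // ignore interim transcripts
      agentTurnOpen = false; // the next bot chunk starts a new agent message
      messages.push({ role: "user", text });
    },

    // Returns a copy so callers cannot mutate internal state.
    getMessages() {
      return messages.map((message) => ({ ...message }));
    },
  };
}
```

Wiring it up would look like client.on(RTVIEvent.BotTtsText, (data) => aggregator.addBotChunk(data.text)), followed by a re-render from aggregator.getMessages().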
Complete HTML example
A single page you can use to test your integration. Browsers cannot resolve the bare import specifiers ("@pipecat-ai/...") on their own, so serve this page through a bundler or dev server such as Vite, or supply an import map, rather than opening the file directly:
<!DOCTYPE html>
<html>
<head><title>Voice Agent Test</title></head>
<body>
  <div id="status">Ready</div>
  <div id="transcript"></div>
  <button id="startBtn">Start Call</button>
  <button id="endBtn" disabled>End Call</button>

  <script type="module">
    import { PipecatClient, RTVIEvent } from "@pipecat-ai/client-js";
    import { DailyTransport } from "@pipecat-ai/daily-transport";

    const statusEl = document.getElementById("status");
    const transcriptEl = document.getElementById("transcript");
    const startBtn = document.getElementById("startBtn");
    const endBtn = document.getElementById("endBtn");

    function handleBotAudio(track, participant) {
      if (participant.local || track.kind !== "audio") return;
      const audio = document.createElement("audio");
      audio.srcObject = new MediaStream([track]);
      audio.play();
    }

    const client = new PipecatClient({
      transport: new DailyTransport(),
      enableMic: true,
      enableCam: false,
      callbacks: { onTrackStarted: handleBotAudio },
    });

    let currentBotMsg = "";

    function addMessage(role, text) {
      const div = document.createElement("div");
      div.textContent = `${role}: ${text}`;
      transcriptEl.appendChild(div);
    }

    client.on(RTVIEvent.UserTranscript, (data) => {
      if (!data.final) return;
      currentBotMsg = "";
      addMessage("You", data.text);
    });

    client.on(RTVIEvent.BotTtsText, (data) => {
      currentBotMsg += (currentBotMsg ? " " : "") + data.text;
      // Update last agent message or add new one
      const lastDiv = transcriptEl.lastElementChild;
      if (lastDiv && lastDiv.textContent.startsWith("Agent:")) {
        lastDiv.textContent = "Agent: " + currentBotMsg;
      } else {
        addMessage("Agent", currentBotMsg);
      }
    });

    client.on(RTVIEvent.Connected, () => {
      statusEl.textContent = "Connected";
      startBtn.disabled = true;
      endBtn.disabled = false;
    });

    client.on(RTVIEvent.Disconnected, () => {
      statusEl.textContent = "Disconnected";
      startBtn.disabled = false;
      endBtn.disabled = true;
    });

    client.on(RTVIEvent.Error, (msg) => {
      statusEl.textContent = "Error: " + (msg?.data?.message || "Unknown");
    });

    startBtn.addEventListener("click", async () => {
      statusEl.textContent = "Connecting...";
      startBtn.disabled = true;
      try {
        // Replace with your backend endpoint
        const res = await fetch("/api/voice-session", { method: "POST" });
        if (!res.ok) throw new Error(`HTTP ${res.status}`);
        const session = await res.json();
        await client.connect({
          url: session.daily_room_url,
          token: session.daily_token,
        });
      } catch (err) {
        statusEl.textContent = "Error: " + err.message;
        startBtn.disabled = false;
      }
    });

    endBtn.addEventListener("click", async () => {
      statusEl.textContent = "Disconnecting...";
      await client.disconnect();
    });
  </script>
</body>
</html>
Connect with React
The @pipecat-ai/client-react package provides PipecatClientProvider and PipecatClientAudio — a provider for context and a component that automatically handles bot audio playback (replacing the manual onTrackStarted approach).
import { useState, useEffect, useCallback } from "react";
import { PipecatClient, RTVIEvent } from "@pipecat-ai/client-js";
import { DailyTransport } from "@pipecat-ai/daily-transport";
import {
  PipecatClientProvider,
  PipecatClientAudio,
} from "@pipecat-ai/client-react";

interface TranscriptEntry {
  role: "user" | "agent";
  content: string;
}

function VoiceChat({ agentName }: { agentName: string }) {
  const [client, setClient] = useState<PipecatClient | null>(null);
  const [status, setStatus] = useState<
    "idle" | "connecting" | "connected" | "error"
  >("idle");
  const [transcript, setTranscript] = useState<TranscriptEntry[]>([]);
  const [error, setError] = useState<string | null>(null);
  const [isMuted, setIsMuted] = useState(false);

  // Create client once
  useEffect(() => {
    const pc = new PipecatClient({
      transport: new DailyTransport(),
      enableMic: true,
      enableCam: false,
    });
    setClient(pc);
    return () => { pc.disconnect().catch(() => {}); };
  }, []);

  // Subscribe to events
  useEffect(() => {
    if (!client) return;

    const onUserTranscript = (data: any) => {
      if (!data.final) return;
      setTranscript((prev) => [
        ...prev,
        { role: "user", content: data.text },
      ]);
    };

    const onBotTtsText = (data: any) => {
      setTranscript((prev) => {
        const last = prev[prev.length - 1];
        if (last?.role === "agent") {
          return [
            ...prev.slice(0, -1),
            { ...last, content: last.content + " " + data.text },
          ];
        }
        return [...prev, { role: "agent", content: data.text }];
      });
    };

    const onConnected = () => setStatus("connected");
    const onDisconnected = () => { setStatus("idle"); setIsMuted(false); };
    const onError = (msg: any) => {
      setError(msg?.data?.message || msg?.data?.error || "Connection error");
      setStatus("error");
    };

    client.on(RTVIEvent.UserTranscript, onUserTranscript);
    client.on(RTVIEvent.BotTtsText, onBotTtsText);
    client.on(RTVIEvent.Connected, onConnected);
    client.on(RTVIEvent.Disconnected, onDisconnected);
    client.on(RTVIEvent.Error, onError);

    return () => {
      client.off(RTVIEvent.UserTranscript, onUserTranscript);
      client.off(RTVIEvent.BotTtsText, onBotTtsText);
      client.off(RTVIEvent.Connected, onConnected);
      client.off(RTVIEvent.Disconnected, onDisconnected);
      client.off(RTVIEvent.Error, onError);
    };
  }, [client]);

  const connect = useCallback(async () => {
    if (!client || status === "connecting") return;
    setError(null);
    setStatus("connecting");
    setTranscript([]);
    try {
      // Call YOUR backend, which proxies to Vocobase API
      const res = await fetch("/api/voice-session", { method: "POST" });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      const session = await res.json();
      await client.connect({
        url: session.daily_room_url,
        token: session.daily_token,
      });
    } catch (err) {
      setError(err instanceof Error ? err.message : "Failed to connect");
      setStatus("error");
    }
  }, [client, status]);

  const disconnect = useCallback(async () => {
    if (!client) return;
    await client.disconnect();
  }, [client]);

  const toggleMute = useCallback(() => {
    if (!client) return;
    client.enableMic(isMuted);
    setIsMuted(!isMuted);
  }, [client, isMuted]);

  if (!client) return null;

  return (
    <PipecatClientProvider client={client}>
      <div>
        <p>Status: {status}</p>
        {error && <p style={{ color: "red" }}>{error}</p>}
        {transcript.map((entry, i) => (
          <div key={i}>
            <strong>{entry.role === "user" ? "You" : "Agent"}:</strong>{" "}
            {entry.content}
          </div>
        ))}
        {status === "connected" ? (
          <>
            <button onClick={toggleMute}>
              {isMuted ? "Unmute" : "Mute"}
            </button>
            <button onClick={disconnect}>End Call</button>
          </>
        ) : (
          <button
            onClick={connect}
            disabled={status === "connecting"}
          >
            {status === "connecting" ? "Connecting..." : "Start Call"}
          </button>
        )}
        {/* Handles bot audio playback automatically */}
        {status === "connected" && <PipecatClientAudio />}
      </div>
    </PipecatClientProvider>
  );
}

export default function App() {
  return <VoiceChat agentName="my-agent" />;
}
PipecatClientAudio renders a hidden <audio> element that plays the agent’s voice. Mount it when connected — it replaces the manual onTrackStarted callback used in the vanilla JS approach.
Events reference
| Event | Payload | Description |
| --- | --- | --- |
| RTVIEvent.Connected | none | WebRTC connection established |
| RTVIEvent.Disconnected | none | Connection closed |
| RTVIEvent.BotReady | none | Agent initialized and ready to talk |
| RTVIEvent.UserTranscript | { text, final } | User’s speech transcribed. Only use entries where final is true. |
| RTVIEvent.BotTtsText | { text } | Agent’s response text. Arrives in chunks; aggregate them per turn. |
| RTVIEvent.TransportStateChanged | state: string | Low-level transport state changes |
| RTVIEvent.Error | { data: { message, error } } | Connection or runtime error |
Backend proxy example
Your backend should proxy the session start call to keep the API key server-side.
// Requires Node.js 18+ for the built-in fetch API.
const express = require("express");
const app = express();

app.post("/api/voice-session", async (req, res) => {
  try {
    const response = await fetch(
      "https://api.vocobase.com/api/v2/sessions/webrtc",
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${process.env.VOCOBASE_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ agent_id: process.env.VOCOBASE_AGENT_ID }),
      }
    );
    const body = await response.json();
    if (!response.ok || !body.success) {
      // Never answer with a 2xx status when the session did not start.
      return res.status(response.ok ? 502 : response.status).json(body);
    }
    // Forward the session data (session_id, agent_id, and the WebRTC
    // credentials) to the browser; the API key never leaves the server.
    res.json(body.data);
  } catch (err) {
    res.status(500).json({ error: "Failed to start voice session" });
  }
});

// Example port; use whatever your deployment expects.
app.listen(process.env.PORT || 3000);
Receiving the transcript and recording
After the session ends, the platform fires a session.completed webhook to each enabled webhook endpoint containing the transcript, recording URL, credit usage, pre-call variables, and any post-call extraction. The call block is omitted for browser WebRTC sessions because there’s no associated phone call.
Configure webhook endpoints with POST /api/v2/config/webhooks; enabled endpoints receive events for both telephony and WebRTC sessions. See Webhook Setup for setup and Webhook Payloads for the full schema.
If your stack can’t accept inbound webhooks, poll GET /api/v2/sessions/{session_id} instead — it returns the same fields with a freshly-minted recording URL on each call:
curl -X GET https://api.vocobase.com/api/v2/sessions/SESSION_ID \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012"
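If you go the polling route, a loop along the following lines may help. It is only a sketch: this page does not say which field signals that a session has finished, so isComplete is a caller-supplied, hypothetical predicate, and fetchSession stands for your own server-side function that issues the GET request shown above with the API key attached. The interval and attempt limits are arbitrary defaults.

```javascript
// Polls until isComplete(session) returns true or attempts run out.
// fetchSession(sessionId): hypothetical async function that performs
//   GET /api/v2/sessions/{session_id} on your server and returns parsed JSON.
// isComplete(session): hypothetical predicate; the completion signal is
//   not specified on this page, so the caller decides what "done" means.
async function pollSession(sessionId, options) {
  const {
    fetchSession,
    isComplete,
    intervalMs = 2000,
    maxAttempts = 30,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const session = await fetchSession(sessionId);
    if (isComplete(session)) return session;
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(
    `Session ${sessionId} not complete after ${maxAttempts} attempts`
  );
}
```

Keep the polling on your backend so the API key stays server-side, and keep the interval well within the rate limits described below.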
Billing
Voice sessions are billed at 1 credit per 60 seconds, pro-rata by the second. Credits are deducted after the session ends.
See Credits & Billing for full details.
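As a worked example of the rate above, a 90-second session comes to 90 / 60 = 1.5 credits. The helper below is a rough estimator only: the name estimateCredits is hypothetical, and it assumes fractional credits are charged exactly, since this page does not state a rounding rule.

```javascript
// Estimates credits for a session at 1 credit per 60 seconds, pro-rata by
// the second. Assumption: no rounding is applied to fractional credits.
function estimateCredits(durationSeconds) {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError("durationSeconds must be a non-negative number");
  }
  return durationSeconds / 60;
}

console.log(estimateCredits(90)); // 1.5
console.log(estimateCredits(45)); // 0.75
```

Treat the result as an estimate for budgeting; the credit usage reported in the session.completed webhook is the authoritative figure.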
Rate limits
| Limit | Value |
| --- | --- |
| Session starts | 10 per minute per API key |
| Concurrent sessions | Configurable per key (default: 5) |
Rate limit headers are included on all responses: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After (on 429).
See Authentication for full rate limit details.
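To act on these headers in a backend proxy, you could read them off each response. The sketch below uses a hypothetical helper name, readRateLimitHeaders, and assumes the numeric headers hold plain numbers. The format of X-RateLimit-Reset is not specified on this page, so it is returned as the raw string.

```javascript
// Extracts rate limit information from a fetch-style Headers object
// (anything with a get(name) method that returns a string or null).
function readRateLimitHeaders(headers) {
  const toNumber = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined || String(raw).trim() === "") {
      return null;
    }
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };

  return {
    limit: toNumber("X-RateLimit-Limit"),
    remaining: toNumber("X-RateLimit-Remaining"),
    // Raw value: whether this is a timestamp or a delta is an open question.
    reset: headers.get("X-RateLimit-Reset") ?? null,
    // Assumption: Retry-After is a number of seconds (sent on 429 only).
    retryAfterSeconds: toNumber("Retry-After"),
  };
}
```

A proxy could, for example, stop starting new sessions once remaining reaches 0 and resume after the reset or Retry-After window passes.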
Troubleshooting
| Issue | Solution |
| --- | --- |
| No audio from agent | Vanilla JS: Ensure onTrackStarted creates an <audio> element and calls .play(). React: Ensure <PipecatClientAudio /> is mounted when connected. Browsers may block autoplay, so always connect via a user gesture (button click). |
| Microphone not working | Check navigator.mediaDevices.getUserMedia permission. HTTPS is required in production. |
| Connection drops immediately | Verify daily_room_url and daily_token from the API are passed correctly to client.connect(). Tokens are single-use. |
| Multiple audio elements | In vanilla JS, track the <audio> element and remove the old one before creating a new one in onTrackStarted. |
| enableCam: false but camera prompt | The WebRTC transport layer may request camera access internally. Setting enableCam: false ensures the camera is never activated. |
HTTPS is required for microphone access in all browsers except localhost. Your production deployment must use HTTPS.
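For the "Multiple audio elements" issue, one way to keep a single element alive is to wrap the onTrackStarted callback in a small factory that remembers the previous element. This is a sketch under assumptions: createBotAudioHandler is a hypothetical name, and the injectable createAudio and createStream options exist only so the logic can be exercised outside a browser; by default they use the same DOM calls as the vanilla JS example.

```javascript
// Returns an onTrackStarted-compatible callback that keeps at most one
// <audio> element for the bot's voice.
function createBotAudioHandler(deps = {}) {
  const createAudio =
    deps.createAudio ?? (() => document.createElement("audio"));
  const createStream =
    deps.createStream ?? ((track) => new MediaStream([track]));
  let current = null;

  return function handleBotAudio(track, participant) {
    // Skip the local user's own tracks and anything that is not audio.
    if (participant?.local || track.kind !== "audio") return null;

    // Tear down the previous element before creating a new one.
    if (current) {
      current.pause();
      current.srcObject = null;
      current.remove();
    }

    current = createAudio();
    current.srcObject = createStream(track);
    const playback = current.play();
    // play() returns a promise in modern browsers; it rejects if autoplay
    // is blocked, so swallow the rejection instead of leaving it unhandled.
    if (playback && typeof playback.catch === "function") {
      playback.catch(() => {});
    }
    return current;
  };
}
```

In the vanilla JS setup you would pass callbacks: { onTrackStarted: createBotAudioHandler() } when constructing the client.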
Next steps
Credits & Billing: Understand how voice sessions consume credits.
Authentication: API key format, rate limits, and error codes.
Webhook Payloads: Receive session.completed events with transcripts and duration.
Quick Start: Create an agent and make your first call.