Speech Recognition Dictionaries
Dictionaries are short lists of domain-specific terms (brand names, product SKUs, technical jargon, people, places) that you attach to an agent. The speech-to-text (STT) engine biases its transcription toward those terms, so the agent hears “Vocobase” instead of “Vocal base” and “SKU-4471” instead of “skew four thousand four hundred seventy one”.
Dictionaries only affect agents on a transcription configuration that supports vocabulary biasing. If you are not sure whether your account is enabled for dictionaries, contact Vocobase support.
Partner API (v2) only. These endpoints live under https://api.vocobase.com/api/v2 and are reached with an rg_live_... Bearer token.
When dictionaries help most
- Proper nouns the model has never seen (your company name, product names, internal codes)
- Ambiguous homophones where context alone does not disambiguate (e.g., “Mira” vs “Meera”)
- Alphanumeric identifiers the user speaks slowly (serial numbers, reference codes)
- Non-English-origin words embedded in an English call (e.g., Indian names, Spanish towns)
When dictionaries do not help
- Accent adaptation (that is handled by the language model itself)
- Entire sentences or long phrases (dictionaries are a per-term bias, not a grammar)
- Replacing STT output with a canonical form (use your own post-processing for that)
Limits
| Limit | Value |
|---|---|
| Dictionary name | 1–100 characters, unique per partner |
| Description | 0–500 characters |
| Terms per dictionary | Unlimited |
| Combined characters across all dictionaries attached to one agent | 10,000 |
| Term length | No hard limit; short phrases (1–4 words) work best |
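The combined-character limit is checked both when you attach dictionaries and when you update one with PATCH, so you cannot accidentally push an agent over the limit.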
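A client-side pre-check can catch an oversized set before the request is sent. The following is a minimal sketch, assuming that char_count is the sum of the character lengths of a dictionary's terms; the exact counting rule is not stated here, so treat the server's answer as authoritative. The function names and the `dictionaries` mapping (dictionary name to term list) are hypothetical.

```python
from typing import Iterable, Mapping, Sequence

# Combined character limit per agent, taken from the limits table.
MAX_COMBINED_CHARS = 10_000


def char_count(terms: Iterable[str]) -> int:
    """Assumed counting rule: sum of the character lengths of all terms."""
    return sum(len(term) for term in terms)


def fits_agent_budget(dictionaries: Mapping[str, Sequence[str]]) -> bool:
    """Return True if all given dictionaries together stay within the limit."""
    total = sum(char_count(terms) for terms in dictionaries.values())
    return total <= MAX_COMBINED_CHARS
```

Because the counting rule is an assumption, a set that passes this check can still be rejected by the API, so keep handling 400 DICTIONARY_CHAR_LIMIT_EXCEEDED.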
Create a dictionary
Creation is agent-agnostic.
POST /api/v2/dictionaries does not take an agent_id, because dictionaries are reusable across agents. To make a dictionary actually bias an agent’s STT, attach it in a separate step with PUT /api/v2/agent/{agent_id}/dictionaries.
When no description is supplied, description is stored as null.
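As an illustration, the create call could be assembled with Python's standard library as shown here. This is a sketch under assumptions: the JSON body is assumed to carry the same name, description, and terms fields that the update endpoint accepts, and `build_create_request` is a hypothetical helper, not part of any Vocobase SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.vocobase.com/api/v2"


def build_create_request(token, name, terms, description=None):
    """Build (but do not send) a POST /dictionaries request.

    Assumption: the body fields are name, terms, and an optional description.
    """
    body = {"name": name, "terms": list(terms)}
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        url=f"{BASE_URL}/dictionaries",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. Note that no agent_id appears anywhere in the request.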
Attach dictionaries to an agent
Attachment is a set-replace.
PUT /agent/{agent_id}/dictionaries overwrites the agent’s entire dictionary set with the dictionary_ids array you send. There is no separate attach/detach endpoint: to add one, send the current ids plus the new one; to detach one, send the current ids minus the one to remove; to detach everything, send []. The swap is wrapped in a transaction, so the agent is never briefly in a partial state.
Every id in dictionary_ids must exist and belong to the calling partner; otherwise the whole request fails with 400 VALIDATION_ERROR. Duplicate ids are silently deduplicated.
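Since the PUT replaces the whole set, the client has to compute the full list of ids it wants to end up with. A minimal sketch of that bookkeeping follows; the helper names are hypothetical, and ids are treated as strings purely for illustration.

```python
from typing import Iterable, List


def next_dictionary_ids(
    current: Iterable[str],
    add: Iterable[str] = (),
    remove: Iterable[str] = (),
) -> List[str]:
    """Return the full id list to send: current plus `add`, minus `remove`."""
    removed = set(remove)
    seen = set()
    result: List[str] = []
    for dictionary_id in [*current, *add]:
        # Skip ids being detached and ids already kept (client-side dedupe).
        if dictionary_id in removed or dictionary_id in seen:
            continue
        seen.add(dictionary_id)
        result.append(dictionary_id)
    return result


def attach_body(dictionary_ids: Iterable[str]) -> dict:
    """JSON body for PUT /agent/{agent_id}/dictionaries."""
    return {"dictionary_ids": list(dictionary_ids)}
```

Fetch the agent's currently attached ids first, pass them as `current`, and send `attach_body([])` to detach everything.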
If the combined char_count across the requested set would exceed 10,000 characters, the request fails with 400 DICTIONARY_CHAR_LIMIT_EXCEEDED.
List an agent’s attached dictionaries
Returns summary fields (including description, but no terms) for every dictionary currently attached to the agent.
List all dictionaries
Returns the partner’s dictionaries sorted by created_at descending. The list endpoint does not include terms; fetch a single dictionary to see them.
Get a single dictionary
Returns the full dictionary, including its terms.
Update a dictionary
PATCH any subset of name, description, or terms. Passing terms replaces the whole list — there is no incremental add/remove. Merge the lists client-side before sending.
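The client-side merge might look like the sketch below. It assumes exact, case-sensitive matching and drops blank entries and duplicates; `merge_terms` and `patch_body` are hypothetical helper names.

```python
from typing import Iterable, List


def merge_terms(
    existing: Iterable[str],
    add: Iterable[str] = (),
    remove: Iterable[str] = (),
) -> List[str]:
    """Return the complete term list to send in the PATCH body."""
    drop = set(remove)
    merged: List[str] = []
    for term in [*existing, *add]:
        term = term.strip()
        # Keep non-empty terms that are not being removed and not already kept.
        if term and term not in drop and term not in merged:
            merged.append(term)
    return merged


def patch_body(terms: Iterable[str]) -> dict:
    """JSON body that replaces the dictionary's whole term list."""
    return {"terms": list(terms)}
```

Read the dictionary's current terms from the single-dictionary endpoint first, since the list endpoint omits them.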
If the dictionary is attached to an agent and the new char_count would push that agent over 10,000 combined characters, the request fails with 400 DICTIONARY_CHAR_LIMIT_EXCEEDED. The error details include the offending agent_id.
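A caller can pull the offending agent out of the failure response. The sketch below assumes an error envelope of the form {"error": {"code": ..., "details": {"agent_id": ..., "current_chars": ...}}}. Only the error.details fields agent_id and current_chars are mentioned in this document; the "code" field name and the helper `over_limit_details` are assumptions.

```python
import json
from typing import Any, Optional, Tuple


def over_limit_details(status: int, body: str) -> Optional[Tuple[Any, Any]]:
    """Return (agent_id, current_chars) for a char-limit failure, else None."""
    if status != 400:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    # Assumption: the error code is exposed under error["code"].
    if not isinstance(error, dict) or error.get("code") != "DICTIONARY_CHAR_LIMIT_EXCEEDED":
        return None
    details = error.get("details")
    if not isinstance(details, dict):
        details = {}
    return details.get("agent_id"), details.get("current_chars")
```

With the agent identified, detach another dictionary from it or trim the updated term list, then retry.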
Delete a dictionary
Returns 204 No Content. The dictionary is removed and detached from every agent it was attached to. No other data is touched.
Tips for authoring good dictionaries
- Write terms the way they are pronounced, not the way they are written. If your product “GraphQL” is pronounced “graph Q L”, add both GraphQL and graph Q L.
- Group by domain, not by size. One dictionary per product line, region, or use case is more useful than one mega-dictionary.
- Audit with real transcripts. Pull a week of call recordings and look for consistent mis-transcriptions — those are your dictionary candidates.
- Remove, don’t just add. Terms that never appear in your actual conversations add noise. Trim after two weeks of usage.
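The trimming step can be partly automated. The sketch below flags terms that never occur in a batch of transcripts, using a simple case-insensitive substring match. The transcript list is a hypothetical input (how you export transcripts is up to you), and a substring match will miss terms that the STT engine spelled differently, so review the result by hand.

```python
from typing import Iterable, List


def unused_terms(terms: Iterable[str], transcripts: Iterable[str]) -> List[str]:
    """Return the terms that appear in none of the transcripts."""
    haystack = "\n".join(transcripts).casefold()
    return [term for term in terms if term.casefold() not in haystack]
```

For example, with the terms Vocobase, SKU-4471, and Mira and transcripts that mention only the first two, the function returns a list containing Mira, which becomes a candidate for removal.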
Troubleshooting
| Symptom | Likely cause |
|---|---|
| Transcripts still wrong after attaching | Dictionary not yet effective on this call — dictionaries apply at session start. Start a new call. |
| 400 DICTIONARY_CHAR_LIMIT_EXCEEDED on attach or update | This agent’s combined term characters would exceed 10,000. Detach something else or split the dictionary. Check error.details for the offending agent_id and current_chars. |
| 400 VALIDATION_ERROR on PUT /agent/{id}/dictionaries with message “One or more dictionary_ids do not exist or are not owned by you” | One of the ids you sent is either wrong or belongs to another partner. The whole request is rejected; fix the list and retry. |
| Dictionary has no effect on accuracy | Your account or agent may not be using a transcription configuration that supports dictionaries. Contact Vocobase support to verify. |
| 409 DICTIONARY_NAME_TAKEN on create or update | Dictionary names must be unique per partner. Rename or update the existing one. |