Speech Recognition Dictionaries

Dictionaries are short lists of domain-specific terms — brand names, product SKUs, technical jargon, people, places — that you attach to an agent. The speech-to-text engine biases its transcription toward those terms, so the agent hears “Vocobase” instead of “Vocal base” and “SKU-4471” instead of “skew four thousand four hundred seventy one”.
Dictionaries only affect agents on a transcription configuration that supports vocabulary biasing. If you are not sure whether your account is enabled for dictionaries, contact Vocobase support.
Partner API (v2) only. These endpoints live under https://api.vocobase.com/api/v2 and require an rg_live_... Bearer token.

When dictionaries help most

  • Proper nouns the model has never seen (your company name, product names, internal codes)
  • Ambiguous homophones where context alone does not disambiguate (e.g., “Mira” vs “Meera”)
  • Alphanumeric identifiers the user speaks slowly (serial numbers, reference codes)
  • Non-English-origin words embedded in an English call (e.g., Indian names, Spanish towns)
They do not help with:
  • Accent adaptation (that is handled by the language model itself)
  • Entire sentences or long phrases (dictionaries are a per-term bias, not a grammar)
  • Replacing STT output with a canonical form (use your own post-processing for that)

Limits

| Limit | Value |
| --- | --- |
| Dictionary name | 1–100 characters, unique per partner |
| Description | 0–500 characters |
| Terms per dictionary | Unlimited |
| Combined characters across all dictionaries attached to one agent | 10,000 |
| Term length | No hard limit; short phrases (1–4 words) work best |
The 10,000-character cap is enforced at attachment time and on PATCH, so you cannot accidentally push an agent over the limit.
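The name and description limits above can be checked before a request is sent. The sketch below is a client-side convenience, not part of the Vocobase API: `validate_dictionary_fields` and the two constants are hypothetical names, and it assumes the server measures length the way Python's `len()` does (Unicode code points), which this page does not specify.

```python
# Limits taken from the table above.
MAX_NAME_CHARS = 100
MAX_DESCRIPTION_CHARS = 500


def validate_dictionary_fields(name, description=None):
    """Return a list of problems; an empty list means the fields fit the limits."""
    problems = []
    if not 1 <= len(name) <= MAX_NAME_CHARS:
        problems.append(
            f"name must be 1-{MAX_NAME_CHARS} characters, got {len(name)}"
        )
    if description is not None and len(description) > MAX_DESCRIPTION_CHARS:
        problems.append(
            f"description must be at most {MAX_DESCRIPTION_CHARS} characters, "
            f"got {len(description)}"
        )
    return problems
```

Name uniqueness cannot be checked locally; the server remains the authority and reports a clash with 409 DICTIONARY_NAME_TAKEN.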

Create a dictionary

Creation is agent-agnostic. POST /api/v2/dictionaries does not take an agent_id — dictionaries are reusable across agents. To make a dictionary actually bias an agent’s STT, attach it in a separate step with PUT /api/v2/agent/{agent_id}/dictionaries.
curl -X POST https://api.vocobase.com/api/v2/dictionaries \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Vocobase Product Catalog",
    "description": "Product names, SKUs, and plan tiers referenced on support calls.",
    "terms": [
      "Vocobase",
      "SKU-4471",
      "Pro Annual Plan",
      "Voice Sandbox",
      "Meera Iyer"
    ]
  }'
{
  "success": true,
  "data": {
    "id": "d1234567-abcd-1234-abcd-123456789012",
    "name": "Vocobase Product Catalog",
    "description": "Product names, SKUs, and plan tiers referenced on support calls.",
    "terms": ["Vocobase", "SKU-4471", "Pro Annual Plan", "Voice Sandbox", "Meera Iyer"],
    "term_count": 5,
    "char_count": 52,
    "attached_agent_count": 0,
    "created_at": "2026-04-24T10:30:00.000Z",
    "updated_at": "2026-04-24T10:30:00.000Z"
  }
}
Terms are normalized server-side: whitespace is trimmed, empty strings are dropped, non-string entries are silently skipped, and duplicates are removed case-insensitively, keeping the casing of the first occurrence. Passing ["Vocobase", "vocobase", " Vocobase "] stores ["Vocobase"], and term_count reflects the normalized list.
A blank or whitespace-only description is stored as null.
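To predict what the server will store, the term normalization rules can be mirrored locally. This is a minimal sketch of the behavior described above, not the server's actual code: `normalize_terms` is a hypothetical helper, and using `str.casefold()` for the case-insensitive comparison is an assumption, since the exact case-folding rule is not documented here.

```python
def normalize_terms(raw_terms):
    """Approximate the documented server-side normalization of a terms list."""
    seen = set()      # case-folded keys already kept
    normalized = []
    for term in raw_terms:
        if not isinstance(term, str):
            continue  # non-string entries are skipped
        trimmed = term.strip()
        if not trimmed:
            continue  # empty or whitespace-only strings are dropped
        key = trimmed.casefold()
        if key in seen:
            continue  # duplicate: the first occurrence's casing wins
        seen.add(key)
        normalized.append(trimmed)
    return normalized
```

For the example above, `normalize_terms(["Vocobase", "vocobase", " Vocobase "])` returns `["Vocobase"]`.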

Attach dictionaries to an agent

Attachment is a set-replace. PUT /agent/{agent_id}/dictionaries overwrites the agent’s entire dictionary set with the dictionary_ids array you send. There is no separate attach/detach endpoint — to add one, send the current ids plus the new one; to detach one, send the current ids minus the one to remove; to detach everything, send []. The swap is wrapped in a transaction, so the agent is never briefly in a partial state.
curl -X PUT https://api.vocobase.com/api/v2/agent/{agent_id}/dictionaries \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012" \
  -H "Content-Type: application/json" \
  -d '{
    "dictionary_ids": [
      "d1234567-abcd-1234-abcd-123456789012"
    ]
  }'
{
  "success": true,
  "data": {
    "dictionaries": [
      {
        "id": "d1234567-abcd-1234-abcd-123456789012",
        "name": "Vocobase Product Catalog",
        "term_count": 5,
        "char_count": 52,
        "attached_agent_count": 1,
        "created_at": "2026-04-24T10:30:00.000Z",
        "updated_at": "2026-04-24T10:30:00.000Z"
      }
    ]
  }
}
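Because attachment is a set-replace, adding or removing a single dictionary means computing the full new id list on the client. The helpers below sketch that step; `with_dictionary` and `without_dictionary` are hypothetical names, and `current_ids` is assumed to come from the `id` fields returned by GET /agent/{agent_id}/dictionaries.

```python
def with_dictionary(current_ids, new_id):
    """Return the id list to PUT so new_id is attached alongside current_ids."""
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys([*current_ids, new_id]))


def without_dictionary(current_ids, removed_id):
    """Return the id list to PUT so removed_id is detached."""
    return [d for d in dict.fromkeys(current_ids) if d != removed_id]
```

Send the result as the `dictionary_ids` array in the PUT body. An empty list detaches everything.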
Every id in dictionary_ids must exist and belong to the calling partner; otherwise the whole request fails with 400 VALIDATION_ERROR. Duplicate ids are deduped silently. If the combined char_count across the requested set would exceed 10,000 characters, the request fails with 400 DICTIONARY_CHAR_LIMIT_EXCEEDED:
{
  "success": false,
  "error": {
    "code": "DICTIONARY_CHAR_LIMIT_EXCEEDED",
    "message": "Attaching these dictionaries would exceed the 10,000-character limit for this agent.",
    "details": {
      "current_chars": 11240,
      "limit_chars": 10000
    }
  }
}
Detach a less-relevant dictionary, or split a large dictionary into smaller, more targeted ones.
Bigger is not better. A focused 200-term dictionary biases the STT engine more effectively than a 2,000-term grab-bag because the engine has fewer distractors to weigh. Keep dictionaries narrow.
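The combined-character check can be approximated locally before sending the PUT. This sketch assumes the limit is evaluated as the sum of the `char_count` values in the summary objects, counted once per dictionary id, and that exactly 10,000 is allowed because the page says the request fails only when the total would exceed the limit. `combined_chars` and `fits_agent_limit` are hypothetical names, and the server remains the authority.

```python
AGENT_CHAR_LIMIT = 10_000  # combined cap per agent, from the Limits section


def combined_chars(dictionaries):
    """Sum char_count over dictionary summaries, counting each id once."""
    unique = {d["id"]: d["char_count"] for d in dictionaries}
    return sum(unique.values())


def fits_agent_limit(dictionaries, limit=AGENT_CHAR_LIMIT):
    """True if attaching exactly these dictionaries should stay within the cap."""
    return combined_chars(dictionaries) <= limit
```

Pass the summary objects from GET /api/v2/dictionaries for the ids you intend to attach.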

List an agent’s attached dictionaries

curl -X GET https://api.vocobase.com/api/v2/agent/{agent_id}/dictionaries \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012"
Returns the summary shape (no description, no terms) for every dictionary currently attached to the agent.

List all dictionaries

curl -X GET https://api.vocobase.com/api/v2/dictionaries \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012"
{
  "success": true,
  "data": {
    "dictionaries": [
      {
        "id": "d1234567-abcd-1234-abcd-123456789012",
        "name": "Vocobase Product Catalog",
        "description": "Product names, SKUs, and plan tiers referenced on support calls.",
        "term_count": 5,
        "char_count": 52,
        "attached_agent_count": 1,
        "created_at": "2026-04-24T10:30:00.000Z",
        "updated_at": "2026-04-24T10:30:00.000Z"
      }
    ]
  }
}
Returns every dictionary owned by the partner, ordered by created_at descending. The list endpoint does not include terms — fetch a single dictionary to see them.

Get a single dictionary

curl -X GET https://api.vocobase.com/api/v2/dictionaries/d1234567-abcd-1234-abcd-123456789012 \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012"
Returns the full detail shape including terms.

Update a dictionary

PATCH any subset of name, description, or terms. Passing terms replaces the whole list — there is no incremental add/remove. Merge the lists client-side before sending.
curl -X PATCH https://api.vocobase.com/api/v2/dictionaries/d1234567-abcd-1234-abcd-123456789012 \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012" \
  -H "Content-Type: application/json" \
  -d '{
    "terms": ["Vocobase", "SKU-4471", "SKU-4472", "Pro Annual Plan"]
  }'
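Since PATCH replaces the whole terms list, an incremental change has to be merged on the client first. A minimal sketch, assuming you have already fetched the current terms with GET /api/v2/dictionaries/{id}: `merge_terms` is a hypothetical helper, and it compares terms case-insensitively with `str.casefold()` to match the documented deduplication as closely as this page allows.

```python
def merge_terms(existing, to_add=(), to_remove=()):
    """Build the full terms list to send in a PATCH body."""
    removed = {t.strip().casefold() for t in to_remove}
    seen = set()
    merged = []
    for term in [*existing, *to_add]:
        cleaned = term.strip()
        key = cleaned.casefold()
        # Skip blanks, removed terms, and case-insensitive duplicates.
        if not key or key in removed or key in seen:
            continue
        seen.add(key)
        merged.append(cleaned)
    return merged
```

New terms are appended after the existing ones; the resulting list is what goes in the `terms` field of the PATCH body.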
If the dictionary is attached to any agent and the new char_count would push that agent over 10,000 combined characters, the request fails with 400 DICTIONARY_CHAR_LIMIT_EXCEEDED. The error details include the offending agent_id:
{
  "success": false,
  "error": {
    "code": "DICTIONARY_CHAR_LIMIT_EXCEEDED",
    "message": "This update would push agent a1b2c3d4-e5f6-7890-abcd-ef1234567890 over the 10,000-character limit.",
    "details": {
      "agent_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "current_chars": 10820,
      "limit_chars": 10000
    }
  }
}
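When this error comes back, the details are enough to work out how much needs to be trimmed. The sketch below reads the error body shown above. `chars_over_limit` is a hypothetical helper, and it assumes `current_chars` is the total the agent would reach if the request were accepted, which is how the sample values read.

```python
def chars_over_limit(error_body):
    """Return (agent_id, overage) for a DICTIONARY_CHAR_LIMIT_EXCEEDED body.

    Returns None for any other body. agent_id is None when the details
    do not carry one, as in the attach example earlier on this page.
    """
    error = error_body.get("error") or {}
    if error.get("code") != "DICTIONARY_CHAR_LIMIT_EXCEEDED":
        return None
    details = error.get("details") or {}
    overage = details["current_chars"] - details["limit_chars"]
    return details.get("agent_id"), overage
```

For the sample response above, the overage works out to 820 characters for agent a1b2c3d4-e5f6-7890-abcd-ef1234567890.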
Changes take effect on the next session that starts. In-flight calls continue with the snapshot taken at session start.

Delete a dictionary

curl -X DELETE https://api.vocobase.com/api/v2/dictionaries/d1234567-abcd-1234-abcd-123456789012 \
  -H "Authorization: Bearer rg_live_abc123def456ghi789jkl012"
Returns 204 No Content. The dictionary is removed and detached from every agent it was attached to. No other data is touched.

Tips for authoring good dictionaries

  • Write terms the way they are pronounced as well as the way they are written. If your product “GraphQL” is pronounced “graph Q L”, add both GraphQL and graph Q L.
  • Group by domain, not by size. One dictionary per product line, region, or use case is more useful than one mega-dictionary.
  • Audit with real transcripts. Pull a week of call recordings and look for consistent mis-transcriptions — those are your dictionary candidates.
  • Remove, don’t just add. Terms that never appear in your actual conversations add noise. Trim after two weeks of usage.
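The trimming tip can be partly automated. As a rough sketch, assuming you have transcripts available as plain strings, the hypothetical helper `unused_terms` below lists dictionary terms that never appear in them. It uses a simple case-insensitive substring match, so it will not catch a term that was spoken but mis-transcribed; treat the output as candidates to review, not a deletion list.

```python
def unused_terms(terms, transcripts):
    """Return the terms that do not occur in any transcript (case-insensitive)."""
    haystack = "\n".join(transcripts).casefold()
    return [term for term in terms if term.casefold() not in haystack]
```

A term on this list that your callers do say is a sign the bias is not working for it, which is worth investigating before removing it.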

Troubleshooting

| Symptom | Likely cause |
| --- | --- |
| Transcripts still wrong after attaching | The dictionary is not yet in effect on this call: dictionaries apply at session start. Start a new call. |
| 400 DICTIONARY_CHAR_LIMIT_EXCEEDED on attach or update | This agent’s combined term characters would exceed 10,000. Detach something else or split the dictionary. Check error.details for current_chars and, on update errors, the offending agent_id. |
| 400 VALIDATION_ERROR on PUT /agent/{id}/dictionaries with message “One or more dictionary_ids do not exist or are not owned by you” | One of the ids you sent is either wrong or belongs to another partner. The whole request is rejected; fix the list and retry. |
| Dictionary has no effect on accuracy | Your account or agent may not be using a transcription configuration that supports dictionaries. Contact Vocobase support to verify. |
| 409 DICTIONARY_NAME_TAKEN on create or update | Dictionary names must be unique per partner. Rename or update the existing one. |