
AI Transcription

The One Voice automatically transcribes calls using Azure OpenAI's Whisper model, producing searchable, speaker-labeled transcripts for every recorded call.

How It Works

  1. A call is recorded (based on the org/queue/extension recording policy)
  2. When the call ends, the recording is submitted to Azure OpenAI Whisper
  3. The transcript is processed with speaker diarization, which labels each segment as Caller or Agent
  4. The transcript is stored and linked to the call record
  5. A Call Recap is automatically generated from the transcript (see AI Transcription and Recaps)

Transcription runs asynchronously — transcripts typically appear within 60–90 seconds of call end.
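Because transcription is asynchronous, any script that consumes transcripts needs to wait for them rather than assume they exist at call end. The sketch below shows one way to poll with a timeout. It is illustrative only: `fetch` is a hypothetical callable that you would supply (this page does not document a public API), and the timeout and interval values are assumptions chosen to sit comfortably above the typical 60–90 second delay.

```python
import time
from typing import Callable, Optional


def wait_for_transcript(
    fetch: Callable[[str], Optional[dict]],  # hypothetical: returns None until ready
    call_id: str,
    timeout_s: float = 180.0,   # assumption: generous margin over 60-90 s
    interval_s: float = 5.0,    # assumption: polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Poll `fetch` until it returns a transcript or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        transcript = fetch(call_id)
        if transcript is not None:
            return transcript
        # Stop if another wait would carry us past the deadline.
        if clock() + interval_s > deadline:
            return None
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the function easy to test without real delays.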

Enabling AI Features

AI transcription requires AI features to be enabled for your org:

  1. Go to Settings → AI
  2. Click Review AI Terms of Service
  3. Accept the terms to enable AI features for your org

Once enabled, transcription runs automatically on all recorded calls with no additional configuration.

Supported Languages

Whisper supports 57 languages. The model auto-detects the language spoken — no configuration required. For calls that mix languages (code-switching), the model transcribes whichever language dominates each segment.

Languages with the strongest accuracy: English, Spanish, French, German, Portuguese, Japanese, Chinese (Mandarin), Italian, Dutch, Polish, Russian.

Viewing Transcripts

From Recordings:

  1. Go to Recordings
  2. Find the call and click it to expand the details
  3. Click the Transcript tab
  4. The transcript appears with speaker labels (Caller: / Agent:) and timestamps

From Call Recaps:

  1. Go to Call Recaps
  2. Open any recap
  3. The full transcript is shown in the Transcript tab alongside the AI summary

From Call Monitor (Live): During an active call, a supervisor can view the transcript in real time on the Call Monitor page. See Live Transcription below.

Searching Transcripts

Transcript search is available in Recordings. Use the search bar to search across all transcripts by keyword. This is useful for:

  • Finding all calls that mentioned a specific customer name or issue
  • Compliance auditing (searching for policy violation language)
  • QA workflows (searching for specific phrases agents should or should not say)
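For QA workflows that go beyond a single keyword search, the same idea can be applied offline to transcripts you have already exported. The following is a minimal sketch, not a product feature: it assumes each transcript has been loaded as a list of `(speaker, text)` pairs, and the function name and phrase lists are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Segment = Tuple[str, str]  # (speaker label, spoken text); assumed shape


def check_phrases(
    segments: Iterable[Segment],
    required: List[str],
    forbidden: List[str],
    speaker: str = "Agent",
) -> Dict[str, List[str]]:
    """Report required phrases the speaker never said and forbidden ones they did."""
    # Join everything the chosen speaker said into one lowercase string.
    spoken = " ".join(
        text.lower() for who, text in segments if who.lower() == speaker.lower()
    )
    return {
        "missing_required": [p for p in required if p.lower() not in spoken],
        "forbidden_found": [p for p in forbidden if p.lower() in spoken],
    }
```

Matching here is a plain case-insensitive substring test, so it inherits any transcription errors in the underlying text.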

Exporting Transcripts

From any transcript view, click Export to download the transcript as:

  • Text (.txt) — plain text with speaker labels and timestamps
  • JSON — structured format for import into other tools

Bulk export is available from the Recordings page — select multiple recordings and click Export Transcripts.
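As an illustration of working with the JSON export, the sketch below renders a transcript as labeled, timestamped lines. The schema is an assumption made for this example: the field names `segments`, `speaker`, `start` (seconds), and `text` are hypothetical, so check them against an actual exported file before relying on them.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Format a second offset as MM:SS (minutes are not capped at 59)."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def render_transcript(raw_json: str) -> str:
    """Turn an exported transcript (assumed schema) into readable lines."""
    data = json.loads(raw_json)
    lines = []
    for seg in data["segments"]:  # hypothetical field names throughout
        stamp = format_timestamp(seg["start"])
        label = seg["speaker"].capitalize()
        lines.append(f"[{stamp}] {label}: {seg['text']}")
    return "\n".join(lines)
```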

AI Call Recaps

A Call Recap is an AI-generated structured summary produced automatically after each transcription completes.

Each recap contains:

Field | Description
Summary | 2–4 sentence narrative of the call
Sentiment | positive / neutral / negative, based on the conversation tone
Topics | Key subjects discussed (auto-detected)
Action Items | Extracted next steps, each tagged high / medium / low priority
Follow-up Recommendation | AI suggestion for what to do next (e.g., "Schedule a demo call")
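The recap fields can be modeled as a small data structure, which is handy if you process recaps in your own tooling. This is a sketch under assumptions: the class and attribute names are hypothetical and only mirror the fields listed above, and `needs_attention` is an example triage rule, not product behavior.

```python
from dataclasses import dataclass, field
from typing import List, Literal

Sentiment = Literal["positive", "neutral", "negative"]
Priority = Literal["high", "medium", "low"]


@dataclass
class ActionItem:
    description: str
    priority: Priority


@dataclass
class CallRecap:
    summary: str                      # 2-4 sentence narrative
    sentiment: Sentiment
    topics: List[str] = field(default_factory=list)
    action_items: List[ActionItem] = field(default_factory=list)
    follow_up: str = ""               # follow-up recommendation


def needs_attention(recaps: List[CallRecap]) -> List[CallRecap]:
    """Example rule: negative sentiment or any high-priority action item."""
    return [
        r for r in recaps
        if r.sentiment == "negative"
        or any(a.priority == "high" for a in r.action_items)
    ]
```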

To view recaps:

  1. Go to Call Recaps in the left navigation
  2. Browse recaps sorted by date, or filter by extension, sentiment, or date range
  3. Click any recap for the full view including the linked transcript

Recaps are also shown inline in the softphone Call History tab for quick reference after a call.

Live Transcription

Supervisors and team leads can view a real-time transcript during an active call. Viewing is passive (listen-only) and does not interrupt the call.

To view a live transcript:

  1. Go to Call Monitor
  2. Find the active call in the live calls table
  3. Click View Transcript (document icon)
  4. A modal opens showing the transcript, which updates about every 2 seconds

Live transcription uses Telnyx's media streaming API to send audio to the transcription service in real time. It works alongside call monitoring (listen/whisper/barge) and does not require recording to be enabled.

Accuracy Notes

  • Background noise and poor audio quality degrade accuracy
  • Heavy accents or fast speech may produce occasional errors
  • Technical jargon and product names may be misspelled (e.g., "Telnyx" may appear as "Telnicks")
  • Transcripts are AI-generated and should not be treated as verbatim legal records without human review
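If you post-process exported transcripts yourself, known misspellings of product names can be corrected with a small glossary. The sketch below is one possible approach, not something the product does: the function name is hypothetical, and the only glossary entry comes from the "Telnicks" example above. Keep the original transcript alongside any corrected copy so that reviewers can see what the model actually produced.

```python
import re
from typing import Dict

# Misheard form -> correct spelling. Extend with your own product names.
CORRECTIONS: Dict[str, str] = {"Telnicks": "Telnyx"}


def apply_glossary(text: str, corrections: Dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches of known misspellings."""
    if not corrections:
        return text
    # Longest keys first so longer entries win over shorter overlapping ones.
    keys = sorted(corrections, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in corrections.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)
```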