Transcription Features
Introduction

Accurate transcription is critical for downstream analytics, making speech to text (STT) one of the most valuable features across contact center solutions. However, call center environments present unique challenges for transcription, including background noise (e.g., other agents speaking), overlapping conversations, and poor audio quality due to telephony limitations.

ElevateAI by NICE leverages decades of research and insights from billions of contact center interactions to deliver highly accurate conversational transcripts via our purpose-built CX model and our next-generation Echo transcription model.

Transcripts are available in two formats:

- Phrase by phrase
- Punctuated, sentence by sentence

This enables flexible integration into analytics, compliance, and customer experience (CX) workflows.

Transcription Models

What's New

In November 2024, we introduced Echo, our next-generation transcription model. With Echo, ElevateAI delivers a significant leap in accuracy, outperforming our original, CX-focused model by 40% and ensuring superior transcription quality across every use case.

Selecting the Echo Model

To use the Echo transcription model, add the body parameter model to your POST Declare request and set its value to echo when declaring an interaction.

Note: Currently, the Echo model does not perform redaction. Future releases will continue to move Echo towards feature parity with the CX transcription model; see the latest release notes for details.

Switching Between Echo and CX Models

Our original, purpose-built CX model can be used by omitting the model parameter or by setting it to cx.

PII Redaction

Personally identifiable information (PII) associated with Social Security numbers, credit card numbers, and CVV/CVC numbers is automatically redacted prior to the transcription being stored or returned to you.

Important: PII redaction is available with the CX transcription model only. The Echo transcription model does not currently support PII redaction, and media processed with Echo will not be redacted.

Details related to the timing, type, and confidence for each redacted segment are returned when retrieving either a phrase-by-phrase or punctuated transcript.

Remember: All source data is deleted immediately upon processing. Please refer to our Data Retention & Security guidelines for additional details.

Sample Transcription Output

    {
      "redactionsegments": [
        {
          "starttimeoffset": 1467720,
          "endtimeoffset": 1468030,
          "result": "cvv",
          "score": 0.98745
        }
      ]
    }

Speaker Labels

ElevateAI automatically identifies and associates each phrase with a given participant (up to two) in the phrase-by-phrase and punctuated transcripts.

Speaker Diarization

ElevateAI supports automatic speaker diarization for both mono (single-channel) and stereo audio inputs using the CX transcription model. We have recently launched automatic speaker diarization for stereo (dual-channel) audio inputs using the Echo transcription model, as we work to bring the two models to feature parity. The system distinguishes between speakers, labeling them as participantone and participanttwo, and tags each phrase in the transcript accordingly.

Key benefits of speaker diarization include:

- Improved transcript clarity: Each speaker's statements are clearly attributed, making transcripts easier to read and understand.
- Simplified post-processing: Accurate speaker labels enable downstream systems to automate tasks like speaker-based analytics, summarization, or role assignment.
- Enhanced compliance & QA: Clear separation of speakers helps ensure regulatory compliance and simplifies quality assurance workflows.

Note: Diarization currently supports two speakers (participantone, participanttwo) and is automatically applied when supported models and audio formats are used.

Channel Labels

For dual-channel audio files, such as a phone recording with the agent and customer on separate channels, ElevateAI will associate each phrase with a specific channel. Leveraging stereo recordings does not impact processing speed or fees.

For more information, see the pricing information and the audio, chat, and transcript processing times.
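
As an illustration of the model selection and redaction output described above, the following is a minimal Python sketch, not an official client. It makes several assumptions: the helper names `build_declare_body` and `confident_redactions` are hypothetical, the other declare fields passed in (`type` in the usage note) are placeholders, and the key names follow the sample output exactly as printed on this page, so the casing in a real response may differ. Only the `model` parameter and its `echo` and `cx` values come from the text. The sketch makes no network calls; it only builds a request body and reads a transcript that has already been retrieved.

```python
import json
from typing import Any, Dict, List, Optional

# Model values named in this page; omitting "model" selects the CX model.
KNOWN_MODELS = ("echo", "cx")


def build_declare_body(base_fields: Dict[str, Any],
                       model: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of the declare body, adding "model" only when requested.

    Hypothetical helper: base_fields stands in for whatever other
    parameters your declare request already sends.
    """
    if model is not None and model not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    body = dict(base_fields)
    if model is not None:
        body["model"] = model
    return body


def confident_redactions(transcript: Dict[str, Any],
                         min_score: float = 0.9) -> List[Dict[str, Any]]:
    """List redacted segments whose score is at least min_score.

    Key names are taken from the sample output above (assumption:
    a real response may use different casing). "length" is the offset
    difference, in whatever unit the offsets use.
    """
    segments = transcript.get("redactionsegments", [])
    return [
        {
            "kind": seg["result"],
            "start": seg["starttimeoffset"],
            "length": seg["endtimeoffset"] - seg["starttimeoffset"],
            "score": seg["score"],
        }
        for seg in segments
        if seg["score"] >= min_score
    ]


# The sample transcription output from this page, as valid JSON.
SAMPLE = json.loads("""
{
  "redactionsegments": [
    {
      "starttimeoffset": 1467720,
      "endtimeoffset": 1468030,
      "result": "cvv",
      "score": 0.98745
    }
  ]
}
""")

if __name__ == "__main__":
    print(build_declare_body({"type": "audio"}, model="echo"))
    print(confident_redactions(SAMPLE))
```

Calling `build_declare_body({"type": "audio"})` without a model leaves the `model` key out entirely, which matches the documented way to fall back to the CX model. Because Echo does not currently redact, a transcript produced with `model` set to `echo` should not be expected to contain redaction segments.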