Overview

Transcription Features

// Introduction

Most downstream analytics processes rely on transcribed text, making speech-to-text one of the most valuable features in any contact center solution. The noises common in call centers – for example, other agents conversing in the background – conversational flow that often branches in multiple directions, and poor-quality telephone signals pose unique challenges to accurately transcribing calls. Powered by decades of research and billions of contact center interactions, ElevateAI by NICE provides unparalleled conversational accuracy and immediate access to transcripts, formatted phrase-by-phrase or punctuated, sentence-by-sentence.



// Models

As of 11/22/2024, we have introduced a new model, Echo. Echo delivers a significant leap in accuracy, outperforming our previous CX-focused model by 40%, ensuring superior transcription quality across every use case. To take advantage of the Echo model, add the body parameter model to your POST request when declaring an interaction and set its value to echo.

NOTE: Currently the Echo model does not diarize between speakers, perform redaction, or process for CX-AI. Future releases will move Echo toward feature parity with the CX model.

Our current CX model can be used by omitting the model parameter or setting it to cx.
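As a sketch of the model selection described above, the snippet below builds the JSON body for a declare request. The field names other than model (here, a languageTag placeholder) and the idea of the overall payload shape are illustrative assumptions, not the documented schema; consult the API reference for the actual declare parameters.

```python
import json

def build_declare_body(model=None):
    """Build an illustrative JSON body for declaring an interaction.

    Omitting "model" (or setting it to "cx") selects the CX model;
    setting it to "echo" selects the Echo model.
    """
    body = {
        "languageTag": "en-us",  # illustrative field, not the documented schema
    }
    if model:
        body["model"] = model
    return body

echo_body = build_declare_body("echo")   # opt in to the Echo model
cx_body = build_declare_body()           # omit model to use the CX model
print(json.dumps(echo_body))
```

The only documented behavior this relies on is the model parameter itself: present and set to echo for the Echo model, absent or cx for the CX model.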

// PII Redaction

NOTE: PII Redaction is performed on media processed with the CX transcription model. Media processed with Echo are not redacted.

Personally Identifiable Information (PII) such as social security numbers, credit card numbers, and CVV numbers is automatically redacted before the transcription is stored or returned to you.



Details related to the timing, type, and confidence for each redacted segment are returned when retrieving either a phrase-by-phrase or punctuated transcript.
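As a hypothetical sketch of working with those redaction details, the snippet below collects the type, confidence, and timing for each redacted segment from a transcript response. All field names here (phrases, redactions, startTimeOffset, endTimeOffset) are illustrative assumptions, not the documented response schema.

```python
# Hypothetical phrase-by-phrase transcript response; field names are
# illustrative assumptions, not the documented ElevateAI schema.
transcript = {
    "phrases": [
        {
            "participant": "participantOne",
            "phrase": "my card number is [redacted]",
            "redactions": [
                {"type": "creditCard", "confidence": 0.97,
                 "startTimeOffset": 12300, "endTimeOffset": 14100},
            ],
        },
        {"participant": "participantTwo", "phrase": "thank you",
         "redactions": []},
    ]
}

def redacted_segments(transcript):
    """Collect (type, confidence, start, end) for every redacted segment."""
    out = []
    for phrase in transcript["phrases"]:
        for r in phrase.get("redactions", []):
            out.append((r["type"], r["confidence"],
                        r["startTimeOffset"], r["endTimeOffset"]))
    return out

print(redacted_segments(transcript))
```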



Remember: All source data is deleted promptly after processing. See data retention & security for additional detail.



Sample Transcription Output



// Speaker Labels

ElevateAI automatically identifies and associates each phrase with a given participant (up to two) in the phrase-by-phrase and punctuated transcript.

Mono Audio Speaker Diarization

For single channel audio files processed with the CX transcription model, ElevateAI leverages speaker diarization to identify and label each phrase within the transcript as belonging to participantOne or participantTwo.

NOTE: This does not apply to mono media processed with the Echo transcription model.
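To illustrate how the participantOne/participantTwo labels described above might be consumed, the sketch below groups phrases by speaker. The participant and phrase field names are illustrative assumptions about the transcript shape, not the documented schema.

```python
from collections import defaultdict

# Hypothetical phrase list from a diarized transcript; field names
# are illustrative assumptions, not the documented schema.
phrases = [
    {"participant": "participantOne", "phrase": "thanks for calling"},
    {"participant": "participantTwo", "phrase": "hi, I have a question"},
    {"participant": "participantOne", "phrase": "happy to help"},
]

def by_speaker(phrases):
    """Group phrase text by its diarized participant label."""
    grouped = defaultdict(list)
    for p in phrases:
        grouped[p["participant"]].append(p["phrase"])
    return dict(grouped)

print(by_speaker(phrases))
```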

Channel Labels

For dual channel audio files, such as a phone recording with the agent and customer on separate channels, ElevateAI will associate each phrase with a specific channel. Note that leveraging stereo recordings does not impact processing speed or fees.