Why Accurate Call Transcription Matters – and How Saicom Delivers Industry-Leading Accuracy

At Saicom, we know that understanding what’s really happening in your contact centre is critical to delivering great customer experiences. Whether you’re using automated quality assurance, AI-driven insights, or analytics tools, everything starts with one essential ingredient – accurate transcription.

The challenge? Real-world call recordings are rarely perfect. Background noise, muffled microphones, and mobile carrier compression can all distort audio long before it reaches your contact centre. This is where Word Error Rate (WER) – the industry’s standard measure of transcription accuracy – comes in.

What is Word Error Rate?

WER measures the share of words a transcription gets wrong, relative to the number of words actually spoken. For example:

Customer says: The cat sat on the mat
Transcript says: The rat sat on the mat

Here, only one word is wrong, giving a WER of 16.7% (1 error in 6 words).

WER counts:

Substitutions (wrong words)
Deletions (missing words)
Insertions (extra words not spoken)

It’s not perfect, but it’s an excellent way to compare different transcription models.
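
As a rough illustration of the calculation described above, here is a minimal Python sketch that computes WER as a word-level edit distance divided by the number of words spoken. The function name word_error_rate and the simple lowercase-and-split tokenisation are assumptions made for this example. They are not Saicom's or QContact's actual scoring pipeline, and production tools usually apply more careful text normalisation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / words spoken.

    Illustrative sketch only: tokenisation is a plain lowercase split.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: spoken word missing from transcript
                curr[j - 1] + 1,     # insertion: extra word in transcript
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("The cat sat on the mat", "The rat sat on the mat")
print(f"{wer:.1%}")  # 16.7%
```

Because the denominator is the number of words actually spoken, a transcript with many inserted words can score a WER above 100%.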

Why Traditional Speech Recognition Struggles

For years, the big names – Microsoft Azure (Microsoft also owns Nuance, the maker of Dragon speech recognition) and Google Cloud – have dominated the market. While they perform well on studio-quality audio, telephone audio is far trickier.

Even Google’s optimised telephone speech model still averages 14.29% WER, meaning roughly one word in every seven is transcribed incorrectly. That’s good – but in the fast-moving world of AI, “good” can be beaten.

Saicom’s Leap Forward with QContact

Through our partnership with QContact, we now use cutting-edge NVIDIA speech technology (publicly released just this week) to achieve a WER of only 6.05% – that’s:

58% fewer errors than Google’s telephone model (6.05% versus 14.29% WER)
15% more accurate than the latest OpenAI Whisper models
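
The 58% comparison with Google’s telephone model is best read as a relative reduction in errors, not a gap in percentage points. The short sketch below reproduces it from the two WER figures quoted in this article. The function name relative_error_reduction is an assumption made for this example.

```python
def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's errors that the new model removes."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# WER figures quoted in this article, in percent.
google_telephone_wer = 14.29
new_model_wer = 6.05

reduction = relative_error_reduction(google_telephone_wer, new_model_wer)
print(f"{reduction:.0%}")  # 58%
```

In absolute terms the gap is 8.24 percentage points. Put another way, about 94 words in every 100 are transcribed correctly, compared with about 86.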

We’ve also expanded automatic transcription to cover 20+ European languages, delivering world-class results like:

Spanish: 3.72% WER
German: 4.90% WER
French: 5.38% WER
Portuguese: 5.95% WER

According to the Open ASR Leaderboard, this is now the second most accurate and fastest multilingual speech recognition model in the world.

Privacy and Compliance You Can Trust

When choosing a CCaaS provider, it’s worth asking:

Where is my audio processed?
Is my data being used to train AI models without consent?

Some providers send recordings to countries without GDPR or POPIA equivalence, or allow training on your data in exchange for discounts from transcription vendors. This could expose you to compliance and privacy risks.

At Saicom, we guarantee:

Your call recordings stay within South Africa for processing
Your data will never be used for training without consent
Call transcription is included free for all customers

And because QContact isn’t tied to a single AI model, we can quickly adopt the latest and best speech recognition technology – without you lifting a finger.
