
WER

Also known as: word error rate

Word Error Rate: the standard accuracy metric for speech recognition, measuring the proportion of words transcribed incorrectly relative to a reference transcript.

WER is calculated as the sum of substitutions, insertions, and deletions in the produced transcript, divided by the total number of words in the reference. Lower WER is better; a WER of 0% would mean perfect transcription. Because insertions count as errors, WER can exceed 100% when the transcript contains many extra words.
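The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `word_error_rate` is hypothetical, and it assumes words are separated by whitespace and compared case-insensitively, with no punctuation or number normalization. The combined count of substitutions, insertions, and deletions is taken as the word-level edit distance between the two transcripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction (0.05 means 5%).

    Assumption: whitespace tokenization and lowercase comparison only.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match if cost is 0
            ))
        prev = curr

    # Total errors divided by the number of reference words.
    return prev[-1] / len(ref)


# One substitution ("fox" -> "box") plus one insertion ("over")
# against a five-word reference.
print(word_error_rate("the quick brown fox jumps",
                      "the quick brown box jumps over"))  # 0.4
```

In practice the text is usually normalized (punctuation, casing, numerals) before scoring, and the choice of normalization can change the reported WER noticeably.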

Modern ASR systems achieve WER below 5% on clean audio with a single speaker in major languages, comfortably better than the 'understandable' threshold relevant for accessibility compliance.

WER alone doesn't capture all dimensions of caption quality. A captioning system might have low overall WER but consistently mis-transcribe brand names, technical jargon, or speaker names — errors that matter more to readers than the average suggests. Custom vocabulary mitigates this.
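One way to surface the gap described above is to score important terms separately from overall WER. The sketch below is an assumption-laden illustration, not an EventRecast feature or API: the function name `term_recall` is hypothetical, and it uses simple case-insensitive substring counting, which will also match a term that appears inside a longer word.

```python
def term_recall(reference: str, hypothesis: str,
                terms: list[str]) -> dict[str, float]:
    """For each term, return the fraction of its reference occurrences
    that also appear in the hypothesis.

    Assumption: case-insensitive substring matching, no alignment.
    """
    ref = reference.lower()
    hyp = hypothesis.lower()
    recall = {}
    for term in terms:
        needle = term.lower()
        expected = ref.count(needle)
        if expected == 0:
            continue  # term never spoken, so there is nothing to score
        # Cap at the expected count so extra mentions cannot exceed 100%.
        found = min(hyp.count(needle), expected)
        recall[term] = found / expected
    return recall


reference = "Welcome to EventRecast. Kubernetes runs EventRecast captions."
hypothesis = "welcome to event recast. kubernetes runs eventrecast captions."
print(term_recall(reference, hypothesis, ["EventRecast", "Kubernetes"]))
# {'EventRecast': 0.5, 'Kubernetes': 1.0}
```

A transcript can score well on overall WER while a key term shows low recall here, which is the kind of error a custom vocabulary is meant to reduce.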
