EventRecast

Diarization

Also known as: speaker diarization

The process of identifying which speaker is talking at any given moment in a multi-speaker recording. Diarization answers 'who spoke when'; combined with a transcript, it tells you who said what.

Speaker diarization analyzes acoustic features (voice pitch, timbre, speaking rhythm) to segment audio by speaker. The output is typically a generic label such as 'Speaker 1' or 'Speaker 2' attached to each spoken segment; these labels can then be mapped to actual names in post-processing.
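As a minimal sketch of that post-processing step, assume the diarizer returns segments with start and end times in seconds plus a generic label. The Segment type, the apply_names helper, the name_map dictionary, and the sample data are all hypothetical illustrations, not the output format of any particular ASR system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """One diarized stretch of speech (hypothetical shape)."""
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording
    speaker: str   # generic label such as "Speaker 1"
    text: str      # transcribed words for this stretch


def apply_names(segments, name_map):
    """Return new segments with generic labels swapped for real names.

    Labels missing from name_map are left unchanged.
    """
    return [
        replace(seg, speaker=name_map.get(seg.speaker, seg.speaker))
        for seg in segments
    ]


# Made-up sample data for illustration only.
segments = [
    Segment(0.0, 4.2, "Speaker 1", "Welcome to the panel."),
    Segment(4.2, 9.8, "Speaker 2", "Thanks for having me."),
    Segment(9.8, 12.0, "Speaker 3", "Glad to be here."),
]
name_map = {"Speaker 1": "Moderator", "Speaker 2": "Dana"}

named = apply_names(segments, name_map)
```

Keeping the mapping separate from the diarization output means names can be corrected later without re-running the audio analysis, and any speaker nobody has identified yet simply keeps the generic label.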

For live captioning, diarization enables speaker labels in transcripts, so panels, interviews, and Q&A sessions stay readable instead of appearing as one undifferentiated block of text.
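To illustrate how speaker labels make a transcript readable, here is a small sketch that turns labeled turns into caption lines and merges consecutive turns from the same speaker. The render_transcript function and the (speaker, text) input shape are assumptions made for this example, not an EventRecast API.

```python
def render_transcript(turns):
    """Format (speaker, text) pairs, given in spoken order, as labeled lines.

    Consecutive turns by the same speaker are joined onto one line.
    """
    lines = []
    last_speaker = None
    for speaker, text in turns:
        if speaker == last_speaker:
            # Same speaker is still talking: extend the current line.
            lines[-1] += " " + text
        else:
            # Speaker changed: start a new labeled line.
            lines.append(f"{speaker}: {text}")
            last_speaker = speaker
    return "\n".join(lines)


# Made-up sample data for illustration only.
turns = [
    ("Speaker 1", "What changed this year?"),
    ("Speaker 2", "Two things."),
    ("Speaker 2", "Pricing and scale."),
    ("Speaker 1", "Go on."),
]
transcript = render_transcript(turns)
```

Without the labels, the same four turns would run together as a single paragraph with no indication of where the question ends and the answer begins.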

Modern automatic speech recognition (ASR) systems often include built-in diarization. Quality depends on how well the speakers are separated in the audio: clean audio with a separate microphone per speaker produces excellent results, while mixed audio from a single room mic produces noisier, less reliable speaker labels.
