Live captioning

Also known as: real-time captioning

Real-time text rendering of spoken audio, displayed alongside an event as it happens.

Live captioning produces text for speech while it is being spoken, typically with an end-to-end latency of 1.5 to 3 seconds between speech and on-screen text. The audience reads along while the speaker is still talking, rather than reading a transcript afterwards.
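
To make the latency figure concrete, here is a minimal sketch of how end-to-end latency could be measured from timestamps. The `CaptionEvent` structure, its field names, and the sample values are hypothetical illustrations, not part of any particular captioning product or API.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class CaptionEvent:
    """One caption segment with hypothetical timestamps, in seconds."""
    text: str
    spoken_at: float     # when the speaker said the words
    displayed_at: float  # when the text appeared on screen


def latencies(events):
    """Return the end-to-end latency of each caption, in seconds."""
    return [e.displayed_at - e.spoken_at for e in events]


def within_typical_range(events, low=1.5, high=3.0):
    """Check whether the median latency falls in the 1.5 to 3 second range."""
    return low <= median(latencies(events)) <= high


# Made-up sample data for illustration only.
events = [
    CaptionEvent("Welcome everyone", spoken_at=2.0, displayed_at=4.0),
    CaptionEvent("to the keynote", spoken_at=3.5, displayed_at=5.5),
    CaptionEvent("let's get started", spoken_at=5.0, displayed_at=8.0),
]

print(latencies(events))
print(within_typical_range(events))
```

The median is used here rather than the mean so that a single delayed segment does not dominate the result; that is a design choice of this sketch, not a requirement.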

Live captioning serves several audiences: deaf and hard-of-hearing attendees, non-native speakers, attendees in noisy or shared environments, and, increasingly, anyone who prefers to read while listening.

Modern live captioning is usually delivered by automatic speech recognition (ASR) systems, often supplemented with a per-event custom vocabulary for technical terms. Human captioning workflows such as CART (Communication Access Realtime Translation) remain in use for high-stakes content where accuracy must exceed what automated systems can deliver.

Related terms

ASR
CART
Real-time captioning
Closed captions
WCAG 2.2
