Live Auto-Captioning Accuracy Falls Below 80% While 99% Standard Is Required
The industry standard for closed captioning accuracy is 99% (no more than 15 errors per 1,500 words), but most automatic speech recognition (ASR) systems achieve only about 80% accuracy in real-world live captioning scenarios. Even at 95% accuracy, better than most live ASR achieves, there is an error on average every 2.5 sentences, so even the best ASR systems produce captions that are unreliable for deaf viewers following complex material like news broadcasts, lectures, or legal proceedings.
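The arithmetic behind these accuracy figures can be sketched as follows. This is a minimal illustration, not from the source; in particular, the 8-words-per-sentence average is an assumption chosen because it reproduces the "one error every 2.5 sentences at 95%" figure:

```python
# Expected caption errors implied by a word-level accuracy rate.

def expected_errors(word_count: int, accuracy: float) -> float:
    """Errors expected in `word_count` words at the given word accuracy."""
    return word_count * (1.0 - accuracy)

def sentences_per_error(accuracy: float, words_per_sentence: float = 8.0) -> float:
    """Average number of sentences between consecutive errors.

    `words_per_sentence` is an assumed average (not stated in the source);
    8 words/sentence recovers the 'error every 2.5 sentences' figure.
    """
    words_per_error = 1.0 / (1.0 - accuracy)
    return words_per_error / words_per_sentence

# 99% accuracy over 1,500 words -> ~15 errors, the industry threshold
print(round(expected_errors(1500, 0.99)))
# 95% accuracy -> one error roughly every 2.5 sentences
print(round(sentences_per_error(0.95), 1))
```

At 80% accuracy, the same arithmetic implies one error roughly every 5 words, which is why live auto-captions at that level become effectively unreadable for complex material.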
Why it matters: Deaf and hard-of-hearing viewers cannot reliably follow live content through auto-captions. So what? They miss critical information in emergency broadcasts, educational settings, workplace meetings, and public events. So what? This information gap creates a parallel experience where deaf viewers are technically included but functionally excluded from understanding what is being communicated. So what? The widespread deployment of cheap ASR captioning has given content providers a justification to stop hiring human CART captioners, reducing the quality of captioning available even as the quantity increases. So what? The deaf community is experiencing a net regression in captioning quality at the precise moment when more content than ever is being produced, widening rather than narrowing the information access gap.
Structural root cause: Captioning regulation (FCC rules for broadcast, ADA for live events) mandates that captions be provided but does not effectively enforce accuracy standards. The economic incentive structure therefore rewards providers who adopt the cheapest automated solution rather than the most accurate one, because the people harmed by poor captions are a small minority with limited market power.
Evidence
- The industry accuracy standard is 99%, per FCC guidelines.
- ASR accuracy averages ~80% in real-world conditions (3Play Media, EdAud.org research).
- At 95% word accuracy, errors occur every 2.5 sentences on average (ASLdeafined, 2025).
- ChatGPT-3.5 post-processing reduced ASR word error rate from 23.07% to 9.75%, a 57.72% relative improvement, but still far below the 99% standard (ACM DIS 2024).
- Multiple studies document a mismatch between reported AI performance benchmarks and the real-world experience of deaf and hard-of-hearing (DHH) users.
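The cited WER improvement is a relative reduction, which can be checked with a one-line formula. Note this is an illustrative recomputation: using the rounded rates 23.07% and 9.75% gives roughly 57.7%, so the source's 57.72% was presumably derived from unrounded figures:

```python
# Relative word-error-rate (WER) reduction from ASR post-processing,
# recomputed from the rounded rates reported in the text.

def relative_wer_reduction(baseline: float, improved: float) -> float:
    """Percentage reduction of `improved` WER relative to `baseline` WER."""
    return (baseline - improved) / baseline * 100.0

# 23.07% -> 9.75% is roughly a 57.7% relative reduction,
# yet the absolute error rate (9.75%) still misses the 1% target by ~10x.
print(round(relative_wer_reduction(23.07, 9.75), 1))
```

The design point worth noting: a large *relative* improvement can coexist with a large *absolute* shortfall, which is exactly the gap between reported benchmark gains and the 99% standard.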