Stuttering therapy apps fail because 200ms cloud round-trip latency destroys real-time auditory feedback

Delayed Auditory Feedback (DAF) therapy for stuttering plays the speaker's own voice back with a precisely controlled delay of 50-200ms. That delay is the therapeutic mechanism that helps the brain reorganize speech timing. Cloud-based speech processing adds 200-500ms of network round-trip latency on top of the therapeutic delay, which puts the total feedback delay at roughly 250-700ms, often past the 400ms point where the technique stops working and actually worsens disfluency. The network delay also varies from moment to moment, and every millisecond of uncontrolled jitter corrupts the therapeutic signal. On-device inference with Gemma 4's native audio processing can maintain sub-20ms processing latency, giving the app precise control over the total delay window. The timing constraints of the therapy demand that the AI processing happens on the same device as the microphone and speaker, because any network hop between them breaks the therapeutic mechanism. This is why no cloud-based stuttering app has matched the efficacy of dedicated $3,000 DAF hardware devices.
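To make the delay budget concrete, here is a minimal Python sketch of the arithmetic and of a sample-level delay line. It is an illustration under stated assumptions, not a working therapy app: the names `buffer_delay_ms` and `DelayLine` are hypothetical, the 16 kHz sample rate is an assumed value, and the 20ms and 200ms pipeline figures are simply the numbers quoted above. Real audio stacks also add their own input and output buffering, which would have to be counted in the pipeline latency.

```python
from collections import deque


def buffer_delay_ms(target_ms: float, pipeline_ms: float) -> float:
    """Delay the app must add itself so that pipeline + buffer == target.

    Raises ValueError when the pipeline alone already exceeds the target,
    which is the situation a slow network round trip creates.
    """
    remaining = target_ms - pipeline_ms
    if remaining < 0:
        raise ValueError(
            f"pipeline latency {pipeline_ms} ms exceeds target {target_ms} ms"
        )
    return remaining


class DelayLine:
    """Fixed-length delay: each sample pushed in comes out delay_samples later."""

    def __init__(self, delay_ms: float, sample_rate_hz: int = 16_000):
        # Assumed sample rate; convert the delay from milliseconds to samples.
        self.delay_samples = round(delay_ms * sample_rate_hz / 1000)
        # Pre-fill with silence so the first outputs are zeros.
        self._buf = deque([0.0] * self.delay_samples)

    def process(self, sample: float) -> float:
        """Push one input sample and return the sample from delay_samples ago."""
        self._buf.append(sample)
        return self._buf.popleft()


if __name__ == "__main__":
    target = 100  # ms, inside the 50-200ms therapeutic range

    # On-device: 20ms of processing leaves room for a controlled buffer.
    local = buffer_delay_ms(target, pipeline_ms=20)
    line = DelayLine(local)
    print(f"on-device: add {local} ms of buffering ({line.delay_samples} samples)")

    # Cloud: even the best-case 200ms round trip overshoots the target.
    try:
        buffer_delay_ms(target, pipeline_ms=200)
    except ValueError as err:
        print(f"cloud: {err}")
```

The point of the sketch is that the app can only add delay, never remove it. When the pipeline latency is small and stable, the remainder is filled by a buffer whose length the app chooses exactly; when the pipeline latency already exceeds the target, or varies from packet to packet, no buffer setting recovers the intended delay.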

Evidence

https://arxiv.org/html/2503.02743v1

Comments