Google has introduced Expressive Captions on Android, an enhancement to Live Caption that conveys how something is said, not just the words. The update addresses a longstanding limitation of captions, which have changed little since they became popular in the 1970s and often miss nuances such as tone, emphasis, and personality. With 70% of Gen Z regularly using captions, especially in noisy environments or on the go, Expressive Captions aim to improve accessibility and engagement by reflecting speech intensity, vocal sounds, and ambient noise.
Features of Expressive Captions
- All caps: Uses capitalization to show speech intensity, e.g., “HAPPY BIRTHDAY!” to indicate excitement.
- Vocal bursts: Identifies human sounds such as sighing, grunting, and gasping to express tone.
- Ambient sound: Labels background noises like applause and cheers to provide environmental context.
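The three behaviors above can be pictured with a small formatting sketch. It is only an illustration of the output style, not how the feature is built: the real system relies on on-device AI models, while the event type, the intensity score, the 0.8 threshold, and the square-bracket label style here are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CaptionEvent:
    """Hypothetical unit of recognized audio (not an Android API type)."""
    kind: str               # "speech", "vocal_burst", or "ambient"
    text: str               # transcript text or a sound label
    intensity: float = 0.0  # assumed 0..1 emphasis score, used for speech only


# Assumed cutoff above which speech is rendered in all caps.
INTENSITY_THRESHOLD = 0.8


def render_caption(event: CaptionEvent) -> str:
    """Turn one event into a stylized caption string."""
    if event.kind == "speech":
        # All caps: capitalization signals high speech intensity.
        if event.intensity >= INTENSITY_THRESHOLD:
            return event.text.upper()
        return event.text
    if event.kind in ("vocal_burst", "ambient"):
        # Vocal bursts and ambient sounds are shown as bracketed labels
        # (the bracket style is an assumption of this sketch).
        return f"[{event.text}]"
    raise ValueError(f"unknown event kind: {event.kind!r}")


if __name__ == "__main__":
    events = [
        CaptionEvent("speech", "Happy birthday!", intensity=0.95),
        CaptionEvent("vocal_burst", "gasps"),
        CaptionEvent("ambient", "applause"),
        CaptionEvent("speech", "Thank you all for coming.", intensity=0.3),
    ]
    for event in events:
        print(render_caption(event))
```

Separating recognition (producing events) from rendering (styling them) keeps the sketch small; in the actual feature, both steps are handled by models running on the device.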
Integration and Technology
Expressive Captions are built into the Android operating system and work across apps, covering livestreams, social media posts, videos in Google Photos, and video messages. Captions are generated in real time on the device, so they work even offline or in airplane mode. Developed by the Android and Google DeepMind teams, the feature uses multiple AI models to turn spoken words into stylized captions and to label a broader range of background sounds, with the aim of making captions as expressive as the audio itself.
Availability
Starting December 5, 2024, Expressive Captions are available in the U.S. in English on devices running Android 14 or higher with Live Caption enabled. By bringing emotional expression and richer context to captions, the feature supports both people with disabilities and general users.







