Google announced that ‘Live Caption’ is now available in English on Pixel 4 and will soon be available on Pixel 3 and other Android devices.
Live Caption is a new Android feature that automatically captions media playing on your phone. The captioning happens in real time and entirely on-device, without using network resources, which preserves privacy and lowers latency.
Google Live Caption works through a combination of three on-device deep learning models: a recurrent neural network (RNN) sequence transduction model for speech recognition (RNN-T), a text-based recurrent neural network model for unspoken punctuation, and a convolutional neural network (CNN) model for sound event classification.
Google explained that Live Caption integrates the signals from the three models to create a single caption track, where sound event tags appear without interrupting the flow of speech recognition results. Punctuation symbols are predicted while text is updated in parallel.
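To make the idea of a single caption track concrete, here is a minimal Python sketch of how already-punctuated speech segments and sound event tags might be merged by timestamp. It is an illustration only, not Google's implementation: the names `SpeechSegment`, `SoundEvent`, and `merge_caption_track` are hypothetical, and the model outputs are hard-coded stand-ins rather than real RNN-T or CNN results.

```python
from dataclasses import dataclass


@dataclass
class SpeechSegment:
    """Hypothetical stand-in for punctuated speech recognition output."""
    start: float  # start time in seconds
    text: str


@dataclass
class SoundEvent:
    """Hypothetical stand-in for a sound event classifier result."""
    start: float  # start time in seconds
    label: str    # e.g. "APPLAUSE" or "MUSIC"


def merge_caption_track(speech, events):
    """Merge speech segments and sound event tags into one time-ordered list.

    Sound tags are emitted as separate bracketed entries, so they sit
    between speech segments instead of being spliced into the text.
    """
    items = [(s.start, 1, s.text) for s in speech]
    items += [(e.start, 0, f"[{e.label}]") for e in events]
    # Sort by start time; on a tie, the tag (0) comes before speech (1).
    items.sort(key=lambda item: (item[0], item[1]))
    return [text for _, _, text in items]


# Assumed example data, not real model output.
speech = [
    SpeechSegment(0.0, "Hello, everyone."),
    SpeechSegment(2.5, "Welcome back."),
]
events = [SoundEvent(1.8, "APPLAUSE")]

print(merge_caption_track(speech, events))
```

A real system would update the track incrementally as audio streams in. This sketch merges complete segments in one pass to keep the ordering logic visible.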