Lip Reading in the Wild (LRW)

Published:
January 27, 2023

The dataset consists of up to 100 utterances of 500 different words, spoken by hundreds of different speakers. All videos are 29 frames (1.16 seconds) in length, and the word occurs in the middle of the video. The word duration is given in the metadata, from which you can determine the start and end frames.

Dataset Technical Specification

Number of files:
1000
Total dataset size:
Duration:
Format:
wav
Sample rate:
Resolution:

Dataset Demographics

Country:
Worldwide
Gender:
M/F 50-50%
Age:
18-55
Number of participants:
100