This page contains the download links to the Lip Reading in the Wild (LRW) dataset, described in [1].
The dataset consists of up to 1000 utterances of each of 500 different words,
spoken by hundreds of different speakers.
All videos are 29 frames (1.16 seconds) in length, and the word occurs in the middle of the video.
The word duration is given in the metadata, from which you can determine the start and end frames.
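As a rough illustration of how the start and end frames might be derived, the sketch below assumes the word is centred exactly on the middle frame and that the frame rate is 25 fps (which follows from 29 frames spanning 1.16 seconds). The function name `word_frame_span` and the choice of 0-based, inclusive frame indices are assumptions made for this example, not part of the dataset tooling.

```python
FPS = 25          # 29 frames / 1.16 s, as stated above
NUM_FRAMES = 29   # every clip has the same length


def word_frame_span(duration_s, num_frames=NUM_FRAMES, fps=FPS):
    """Estimate the (start, end) frames of the word in a clip.

    Assumes the word is centred on the middle frame of the clip.
    Indices are 0-based and inclusive, clamped to the clip bounds.
    """
    centre = (num_frames - 1) / 2      # 14.0 for a 29-frame clip
    half = duration_s * fps / 2        # half the word length, in frames
    start = max(0, round(centre - half))
    end = min(num_frames - 1, round(centre + half))
    return start, end
```

For a word lasting 0.4 seconds, this gives frames 9 to 19; a duration equal to the full clip length is clamped to frames 0 to 28.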
The dataset statistics are given in the table below. The full list of classes in the dataset is given
here.
Set | Dates | # classes | # utterances per class |
Train | 01/01/2010 - 31/08/2015 | 500 | 800-1000 |
Validation | 01/09/2015 - 24/12/2015 | 500 | 50 |
Test | 01/01/2016 - 30/09/2016 | 500 | 50 |
An example video and the corresponding metadata can be found at the links below. Please note that your web browser may not play the mp4 file correctly.
Example mp4 video
Visualisation of video clips for selected words (with thanks to Donglai Wei for providing these)
The package, which includes the videos and the metadata, is available for non-commercial academic research. You will need to sign a Data Sharing agreement with BBC Research & Development before getting access. To download a copy of the agreement, please go to the BBC Lip Reading in the Wild and Lip Reading Sentences in the Wild Datasets page. Once approved, you will be supplied with a password, and the package can then be downloaded using the links below. Please cite [1] below if you make use of the dataset.
For all technical questions, please contact the author of [1].
File | Link | MD5 Checksum |
Part A | Download | 474f255cdf6da35f41824d2b8a00d076 |
Part B | Download | ef03d6ab52d14de38db23365e2e09308 |
Part C | Download | 532343bbb5f14ab14623c5cce5c8b930 |
Part D | Download | 78709823e18c3906e49b99536c5343de |
Part E | Download | abb5fcf3480f2899d09d0171b716026f |
Part F | Download | b311feea9705533350a030811501f859 |
Part G | Download | 37e525220e8d47bc7b8bee4753131390 |
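
Each downloaded part can be checked against the MD5 checksums in the table above before unpacking. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_md5` are chosen for this example, and the local file names are not specified on this page, so any path passed in is an assumption.

```python
import hashlib

# Checksums copied from the table above.
EXPECTED_MD5 = {
    "Part A": "474f255cdf6da35f41824d2b8a00d076",
    "Part B": "ef03d6ab52d14de38db23365e2e09308",
    "Part C": "532343bbb5f14ab14623c5cce5c8b930",
    "Part D": "78709823e18c3906e49b99536c5343de",
    "Part E": "abb5fcf3480f2899d09d0171b716026f",
    "Part F": "b311feea9705533350a030811501f859",
    "Part G": "37e525220e8d47bc7b8bee4753131390",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """Return True if the file's MD5 matches the expected checksum."""
    return md5_of_file(path) == expected_hex.lower()
```

For example, `verify_md5("lrw_part_a.tar", EXPECTED_MD5["Part A"])` would return `True` for an intact download, where `lrw_part_a.tar` is a hypothetical local file name. Reading in chunks keeps memory use low for large archives.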