Resources for publicly available machine learning datasets, including speech

Purpose:

To collect links to freely available datasets for machine learning (with an emphasis on speech). Links were working as of January 2019.

Read English: LibriSpeech
American English sentences spoken in various dialects, with good annotations: TIMIT corpus (partial)
Variety of languages, recorded by volunteers worldwide: Voxforge
YouTube videos of celebrities, usable for audio and/or visuals (English): VoxCeleb
Variety of languages: Arctic2 Database
Datasets/databases for a variety of research purposes/algorithms: The Centre for Speech Technology Research
Corpus of Spoken Dutch
Annotated, with natural sounds (laughs, coughs, etc.): Meeting Recorder Dialog Act Database
LibriVox: free public domain audiobooks
Free Spoken Digit Dataset
Two North American dialects, with speakers reading Harvard sentences (I believe): PN/NC Corpus

Speakers with various disorders, as well as healthy speakers (German): Saarbrücker Voice Database
Children with Specific Language Impairment (Czech): LANNA database
Interviews of people with clinical disorders (anxiety, depression, post-traumatic stress disorder): DAIC-WOZ Database

German: EmoDB
English: The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

Object class recognition: PASCAL VOC. Registration is required, but the download should be free.
Object class recognition: COCO (Common Objects in Context)
IMAGENET: “an image database organized according to the WordNet hierarchy”.