This repository is a collection of real-world speech recordings and tools to facilitate subjective testing and analysis. The datasets are designed for speech quality and intelligibility assessment, including subjective tests in laboratory or crowdsourced settings.
We provide audio files and support code for the Diagnostic Rhyme Test (DRT) to assess speech intelligibility in several languages.
We share helper code to set up crowdsourced MUSHRA tests (forthcoming). The test design and participant screening follow ITU recommendations and have been adapted to the crowdsourced setting. The flexible test setup can be used with any suitable high-quality clean speech test data. The release of this tool will be timed with our paper presentation at INTERSPEECH 2025 in Rotterdam.
We are extremely grateful for the extensive support that our colleagues and collaborators have provided in the realisation of these multilingual speech testing resources. Specifically, we would like to thank Nerio Morán Páez, Miguel Plaza Rosillon, Ginette Leon Prato, Daniel Arismendi, Shirley Pestana Rodriguez, Omid Roshani, Jose Kordahi, Cyprian Wronka, Anna Bartlett, and Ana Rivera Jaramillo. We appreciate their meticulous attention to detail and relentless dedication, which were essential to getting this project off the ground.
Unless stated otherwise, these multilingual speech datasets are licensed under a CC BY-SA 4.0 license.