cisco/multilingual-speech-testing

Multilingual Speech Testing

This repository is a collection of real-world speech recordings and tools to facilitate subjective testing and analysis. The datasets are designed for speech quality and intelligibility assessment, including subjective tests in laboratory or crowdsourced settings.

Speech Intelligibility Assessment

We provide audio files and support code for the Diagnostic Rhyme Test (DRT) to assess speech intelligibility in several languages.

Crowdsourced MUSHRA Testing

We share helper code to set up crowdsourced MUSHRA tests (forthcoming). The test design and participant screening follow ITU recommendations and have been adapted to the crowdsourced setting. The test setup is flexible and can be used with any suitable high-quality clean speech test data. The release of this tool will be timed with our paper presentation at INTERSPEECH 2025 in Rotterdam.

Acknowledgements

We are extremely grateful for the extensive support that our colleagues and collaborators have provided in the realisation of these multilingual speech testing resources. Specifically, we would like to thank Nerio Morán Páez, Miguel Plaza Rosillon, Ginette Leon Prato, Daniel Arismendi, Shirley Pestana Rodriguez, Omid Roshani, Jose Kordahi, Cyprian Wronka, Anna Bartlett, and Ana Rivera Jaramillo. We appreciate their meticulous attention to detail and relentless dedication, which were essential to getting this project off the ground.

Licensing

Unless stated otherwise, these multilingual speech datasets are licensed under a CC BY-SA 4.0 license.

About

Test software and data for evaluation of speech processing algorithms in multiple languages
