Tuesday, 7 March 2017

The English Consistent Confusion Corpus

The English Consistent Confusion Corpus is a large-scale collection of noise-induced misperceptions of British English speech. These misperceptions were elicited by asking listeners to transcribe English words mixed with complex noise backgrounds.

The corpus has been distilled from over 300,000 listener responses and includes responses to over 9,000 individual noisy speech tokens. Of these, more than 3,000 met the minimal consistency criterion: at least 6 listeners reported the same incorrect response.
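To make the consistency criterion concrete, here is a minimal Python sketch of how such a check might look. It is not the procedure used to build the corpus: the function name, the input layout (a target word plus a list of transcriptions) and the normalisation (trimming whitespace, lower-casing, ignoring empty responses) are all assumptions made for illustration. Only the threshold of 6 comes from the description above.

```python
from collections import Counter

# Threshold taken from the corpus description; everything else is assumed.
MIN_AGREEMENT = 6


def consistent_confusion(target, responses, min_agreement=MIN_AGREEMENT):
    """Return the most frequent incorrect response to one noisy token if at
    least `min_agreement` listeners gave it, otherwise None.

    Hypothetical helper: normalisation here is an assumption, not the
    corpus authors' method.
    """
    target_norm = target.strip().lower()
    normalised = (r.strip().lower() for r in responses)
    # Count only non-empty responses that differ from the target word.
    wrong = Counter(r for r in normalised if r and r != target_norm)
    if not wrong:
        return None
    response, count = wrong.most_common(1)[0]
    return response if count >= min_agreement else None
```

For example, a token whose target is "wizard" and which six listeners transcribed as "whistle" would count as a consistent confusion under this sketch, while the same token with only five matching incorrect responses would not.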


The ECCC can be downloaded as a zip file. This work is licensed under a Creative Commons Attribution 4.0 International License.

More information on:
