Simon/Contribute Data
To build a speech recognition system, several types of data files are required:
- A phonetic dictionary to learn how words are pronounced
- Transcribed audio samples to learn how speakers actually pronounce the phones (the phonetic elements used in the dictionary)
- Large corpora of written text to learn which word sequences commonly occur together (this gives the recognizer context)
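
As a rough illustration of how two of these data types are used, the sketch below looks up words in a toy pronunciation dictionary and counts adjacent word pairs in a tiny text corpus. Everything in it is an assumption made for illustration: the `PRONUNCIATIONS` table, the phone symbols, the function names `phones_for` and `bigram_counts`, and the sample sentences are hypothetical and do not come from Simon or from any real dictionary or corpus.

```python
from collections import Counter

# Hypothetical toy pronunciation dictionary: word -> list of phones.
# The phone symbols are illustrative only.
PRONUNCIATIONS = {
    "open": ["OW", "P", "AH", "N"],
    "the": ["DH", "AH"],
    "file": ["F", "AY", "L"],
    "close": ["K", "L", "OW", "Z"],
}


def phones_for(transcription):
    """Return the phone list of each word in a transcription.

    Raises KeyError if a word is missing from the dictionary.
    """
    return [PRONUNCIATIONS[word] for word in transcription.lower().split()]


def bigram_counts(corpus_lines):
    """Count how often each pair of adjacent words occurs in the corpus."""
    counts = Counter()
    for line in corpus_lines:
        words = line.lower().split()
        # zip pairs every word with its successor on the same line
        counts.update(zip(words, words[1:]))
    return counts


# Hypothetical sample corpus of written text
corpus = ["open the file", "close the file", "open the file"]
counts = bigram_counts(corpus)

print(phones_for("Open the file"))
print(counts.most_common(2))
```

A real system works with far larger dictionaries and corpora, and it also needs the transcribed audio samples, which a short text-only example like this cannot show.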