Real-Time Vocal Tract Length Normalization in a Phonological Awareness Teaching System

Dénes Paczolay¹, András Kocsor², László Tóth³

Speaker normalization can significantly improve the accuracy of a speech recognition system. One such method, vocal tract length normalization (VTLN), is especially useful when the system has to work reliably for males, females and children. This is exactly the situation with our phonological awareness teaching system, the "SpeechMaster", which aims at real-time phoneme recognition and feedback. Because most VTLN algorithms work off-line, real-time operation poses an additional problem. This paper examines how a well-known off-line algorithm can be approximated on-line by machine learning regression techniques. We conclude that, by employing a real-time estimation of VTLN parameters, the recognition error can be reduced by some 14-24%.
