A Comparative Study of Several
Feature Transformation and Learning Methods for Phoneme Classification
Kocsor, A., Tóth, L., Kuba, A. Jr., Kovács,
K., Jelasity, M., Gyimóthy, T., Csirik, J.
This paper examines the applicability of some learning techniques
for speech recognition, more precisely, for the classification
of phonemes represented by a particular segment model. The methods
compared were TiMBL (the IB1 algorithm), C4.5 (ID3 tree learning),
OC1 (oblique tree learning), artificial neural nets (ANN) and Gaussian
mixture modeling (GMM); as a reference, an HMM recognizer
was also trained on the same corpus. Before being fed into
the learners, the segmental features were additionally transformed
using either linear discriminant analysis (LDA), principal component
analysis (PCA) or independent component analysis (ICA). Each learner
was tested with each transformation in order to find the best
combination. Furthermore, we experimented with several feature
sets such as filter-bank energies, mel-frequency cepstral coefficients
(MFCC) and gravity centers. We found that LDA helped all the learners,
in several cases quite considerably. PCA was beneficial only for
some of the algorithms, while ICA rarely improved the results
and was detrimental to certain learning methods. From the learning viewpoint,
ANN was the most effective, and attained the same results independently
of the transformation applied. GMM behaved worse, which shows
the advantages of discriminative over generative learning. TiMBL
produced reasonable results, while C4.5 and OC1 could not compete,
no matter what transformation was tried.
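
The experimental design described above (every learner tested with every feature transformation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' setup: it uses scikit-learn, synthetic data in place of the speech corpus, a 1-nearest-neighbour classifier as a rough stand-in for TiMBL's IB1, a CART decision tree in place of C4.5, and a small multilayer perceptron as the ANN. OC1, GMM and the HMM reference are omitted. The helper names make_transforms, make_learners and run_grid are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FastICA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def make_transforms(n_components):
    """Factories for the feature transformations; None means raw features."""
    return {
        "none": lambda: None,
        "PCA": lambda: PCA(n_components=n_components),
        "ICA": lambda: FastICA(n_components=n_components, max_iter=1000,
                               random_state=0),
        # LDA is supervised: it can yield at most (n_classes - 1) components.
        "LDA": lambda: LinearDiscriminantAnalysis(n_components=n_components),
    }


def make_learners():
    """Factories for the learners (stand-ins, see the assumptions above)."""
    return {
        "1-NN": lambda: KNeighborsClassifier(n_neighbors=1),
        "tree": lambda: DecisionTreeClassifier(random_state=0),
        "ANN": lambda: MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                                     random_state=0),
    }


def run_grid(X, y, n_components=3, cv=3):
    """Cross-validate every (transformation, learner) pair.

    Returns a dict mapping (transform_name, learner_name) to mean accuracy.
    """
    results = {}
    for t_name, make_t in make_transforms(n_components).items():
        for l_name, make_l in make_learners().items():
            steps = [StandardScaler()]
            transform = make_t()
            if transform is not None:
                steps.append(transform)
            steps.append(make_l())
            # The transformation is fitted inside each fold, so no
            # information from the held-out fold leaks into it.
            scores = cross_val_score(make_pipeline(*steps), X, y, cv=cv)
            results[(t_name, l_name)] = float(scores.mean())
    return results


if __name__ == "__main__":
    # Synthetic 4-class data standing in for segmental phoneme features.
    X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                               n_classes=4, n_clusters_per_class=1,
                               random_state=0)
    for (t_name, l_name), acc in sorted(run_grid(X, y).items()):
        print(f"{t_name:>4} + {l_name:<4}: {acc:.3f}")
```

Accuracies obtained on synthetic data say nothing about the paper's findings; the sketch only shows how such a transformation-by-learner grid can be organized.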