Pub. online: 1 Jan 2010 | Type: Research Article | Open Access
Volume 21, Issue 3 (2010), pp. 361–374
This paper deals with the use of formant features in speech recognition based on dynamic time warping. These features are easy to visualize and give new insight into the causes of speech recognition errors. A formant feature extraction method based on singular prediction polynomials has been applied to the recognition of isolated words. However, recognition performance depends on the order of the singular prediction polynomials, on whether symmetric or antisymmetric polynomials are used, and on whether an even or odd polynomial order is chosen. It is also important to know how informative the individual formants are and how the recognition results depend on other parameters of the recognition system: the analysis frame length, the number of formants used in recognition, the frequency scale used to represent the formant features, and the preemphasis filter parameters. With properly chosen processing parameters, speech recognition performance can be optimized.
The aim of the current investigation is to optimize the performance of formant-feature-based isolated word recognition by varying the processing parameters of the recognition system, and to find improvements that make the system more robust to white noise. The optimization experiments were carried out using speech recordings of 111 Lithuanian words. The speech signals were recorded in a conventional room environment (SNR = 30 dB). White noise was then generated at predefined levels (65 dB, 60 dB and 55 dB) and added to the test utterances. Recognition performance was evaluated at each of these noise levels.
The optimization experiments allowed us to considerably improve the performance of the formant-feature-based speech recognition system and made it more robust to white noise.
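The noise-addition step described in this abstract can be illustrated with a short sketch. Note the assumptions: the abstract states absolute noise levels (65 dB, 60 dB and 55 dB) and does not say how they map onto the recorded signal, so the sketch below instead scales Gaussian white noise to a target signal-to-noise ratio. The function names `add_white_noise` and `measure_snr_db` and the `target_snr_db` parameter are hypothetical and are not taken from the paper.

```python
import math
import random


def add_white_noise(signal, target_snr_db, seed=0):
    """Return signal plus Gaussian white noise scaled to a target SNR (dB).

    Assumption: noise is specified as an SNR relative to the clean signal,
    which is not necessarily how the paper defines its noise levels.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Mean power of the clean signal and of the unscaled noise.
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    # Noise power required for the requested SNR.
    p_target = p_signal / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(p_target / p_noise)
    return [x + gain * v for x, v in zip(signal, noise)]


def measure_snr_db(clean, noisy):
    """Measure the SNR (dB) of noisy relative to the clean reference."""
    p_signal = sum(x * x for x in clean)
    p_noise = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)


# Usage: a synthetic tone stands in for a recorded test utterance.
clean = [math.sin(2.0 * math.pi * 0.01 * n) for n in range(1000)]
noisy = add_white_noise(clean, target_snr_db=10.0)
```

Scaling the noise by its measured power, and not by its nominal variance, makes the resulting SNR match the target for the particular noise realization drawn.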
Pub. online: 1 Jan 2008 | Type: Research Article | Open Access
Volume 19, Issue 2 (2008), pp. 213–226
The possibility of using formant features (FF) in user-dependent isolated word recognition has been investigated. Word recognition was performed using a dynamic time warping technique. Several formant feature extraction methods were compared, and a method based on singular prediction polynomials is proposed for the recognition of isolated words. The recognition performance of the proposed method was compared with that of linear prediction coding (LPC) features and LPC-derived cepstral (LPCC) features. In total, 111 Lithuanian words were used in the recognition experiment, and recognition performance was evaluated at various noise levels. The experiments showed that formant features calculated from the singular prediction polynomials are more reliable than the LPC and LPCC features at all noise levels.
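As an illustration of the dynamic time warping technique mentioned in this abstract, the following is a minimal sketch of template matching for isolated words. It assumes each utterance is already a sequence of per-frame feature vectors (for example, formant frequencies), uses Euclidean distance between frames, and applies the basic symmetric step pattern with no path constraints. These choices, and the names `dtw_distance` and `recognize`, are assumptions for illustration and not the configuration used in the paper.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(test, templates):
    """Return the label of the reference template closest to test."""
    return min(templates, key=lambda word: dtw_distance(test, templates[word]))


# Usage with toy one-dimensional "features"; the labels are placeholders.
templates = {"word_a": [(0.0,), (0.0,)], "word_b": [(5.0,), (5.0,)]}
label = recognize([(4.5,), (5.0,), (5.0,)], templates)
```

Because the warping path may repeat frames, utterances of different lengths can be compared directly, which is why the test sequence above has three frames while the templates have two.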
Pub. online: 1 Jan 2002 | Type: Research Article | Open Access
Volume 13, Issue 1 (2002), pp. 37–46
An isolated word speech recognition system based on dynamic time warping (DTW) has been developed. Speaker adaptation is performed using speaker recognition techniques. Vector quantization is used to create reference templates for speaker recognition. Linear predictive coding (LPC) parameters are used as features for recognition. Performance is evaluated using 12 words of the Lithuanian language pronounced ten times by ten speakers.
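For readers unfamiliar with LPC parameters, the sketch below computes them for a single analysis frame using the autocorrelation method with the Levinson-Durbin recursion. This is one standard way to obtain LPC coefficients; the abstract does not state which method, model order or windowing the authors used, so the function names `autocorrelation` and `lpc` and the orders shown are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def lpc(frame, order):
    """Levinson-Durbin recursion.

    Returns ([1, a1, ..., ap], residual_energy) for the prediction error
    filter A(z) = 1 + a1*z^-1 + ... + ap*z^-p.
    Assumes the frame is not all zeros (r[0] > 0).
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update the coefficients from the previous stage.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Usage: a decaying exponential behaves like a first-order AR signal,
# so the first coefficient comes out close to -0.9.
frame = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(frame, order=2)
```

In a DTW recognizer of the kind described here, one such coefficient vector per frame would form the feature sequence that is compared against the reference templates.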