Development of HMM/Neural Network‐Based Medium‐Vocabulary Isolated‐Word Lithuanian Speech Recognition System
Volume 15, Issue 4 (2004), pp. 465–474
Pub. online: 1 January 2004
Type: Research Article
Received
1 December 2004
Published
1 January 2004
Abstract
This paper describes the development of a Lithuanian speech recognition system built on a hybrid HMM/ANN architecture, which combines artificial neural networks (ANNs) and hidden Markov models (HMMs). In this architecture, a fully connected three‐layer neural network (a multi‐layer perceptron) is trained with the conventional stochastic back‐propagation algorithm to estimate the probabilities of 115 context‐independent phonetic categories; during recognition, the network serves as the state output probability estimator. The system uses Mel Frequency Cepstral Coefficients (MFCC) as features and was developed using the CSLU Toolkit. It was tested on the VDU isolated‐word Lithuanian speech corpus and evaluated on a speaker‐independent recognition task of about 750 distinct isolated words. The word recognition accuracy obtained was about 86.7%.
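The hybrid scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the CSLU Toolkit implementation and not the authors' code. Only the 115 output categories come from the abstract; the feature dimension, the hidden-layer size, the random weights, the stay-or-advance word model with uniform transitions, and the helper names (make_mlp, posteriors, viterbi_log_score, recognize) are assumptions made purely for illustration.

```python
import math
import random

NUM_CATEGORIES = 115  # context-independent phonetic categories (from the abstract)
NUM_FEATURES = 13     # assumed MFCC vector size per frame (not given in the abstract)
NUM_HIDDEN = 20       # assumed hidden-layer size (not given in the abstract)


def make_mlp(rng):
    """Random weights for a fully connected input-hidden-output network."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(NUM_FEATURES)]
          for _ in range(NUM_HIDDEN)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(NUM_HIDDEN)]
          for _ in range(NUM_CATEGORIES)]
    return w1, w2


def posteriors(mlp, frame):
    """Forward pass: sigmoid hidden layer, softmax output P(category | frame)."""
    w1, w2 = mlp
    hidden = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, frame))))
              for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def viterbi_log_score(frame_posteriors, state_categories):
    """Best-path log score of a left-to-right word model.

    Each state is tied to one phonetic category, and the network output for
    that category is used as the state output score. A path may stay in a
    state or advance by one; transition costs are assumed uniform and omitted.
    """
    neg_inf = float("-inf")
    n_states = len(state_categories)
    score = [neg_inf] * n_states
    score[0] = math.log(frame_posteriors[0][state_categories[0]])
    for post in frame_posteriors[1:]:
        new = [neg_inf] * n_states
        for s in range(n_states):
            best_prev = max(score[s], score[s - 1] if s > 0 else neg_inf)
            if best_prev > neg_inf:
                new[s] = best_prev + math.log(post[state_categories[s]])
        score = new
    return score[-1]  # the path must end in the final state


def recognize(frame_posteriors, lexicon):
    """Return the lexicon word whose state sequence scores highest."""
    return max(lexicon,
               key=lambda word: viterbi_log_score(frame_posteriors, lexicon[word]))
```

In use, each MFCC frame of an utterance would be passed through posteriors, and recognize would be called with the resulting list and a lexicon that maps every word to its sequence of category indices. An untrained network with random weights, as created by make_mlp, only demonstrates the data flow and says nothing about accuracy.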