Because Lhasa Tibetan is a tonal language, tone is essential for discriminating homophones. However, the number of tones in the Lhasa dialect remains controversial, and this uncertainty makes it difficult to use tonal information in automatic speech recognition (ASR) for the dialect. In this study, we adopted a four-tone pattern and designed a phone set based on a scheme of four contour contrasts. At the feature level, we tried different pitch trackers to estimate the fundamental frequency of each frame and then combined the fundamental frequencies with MFCC features to train acoustic models. To test whether tonal features are useful, we created a small-scale corpus and built an ASR system for the Lhasa dialect from scratch, training DNN-HMM models with different phoneme sets and acoustic features. The experimental results showed that both the tonal phoneme set and the tonal acoustic features improve system performance. With the tonal phoneme set, the two pitch trackers yielded relative performance improvements of 11.1% and 7.9%, respectively. Combining these acoustic models yielded a relative performance improvement of 16.0%. This preliminary study shows that tonal information plays an important role in speech recognition of the Tibetan Lhasa dialect.
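
The feature-level combination described above can be illustrated with a minimal sketch. It assumes that each frame already has an MFCC vector and one fundamental-frequency estimate from a pitch tracker, and that the combination is a simple per-frame concatenation. The function name `append_pitch`, the use of log F0, and the convention of 0 for unvoiced frames are illustrative assumptions, not details taken from this study.

```python
import math


def append_pitch(mfcc, f0):
    """Append one pitch value to each frame's MFCC vector.

    mfcc: list of per-frame coefficient lists.
    f0:   list of per-frame F0 estimates in Hz, with 0 marking unvoiced
          frames (an assumed convention, not one stated in the study).
    """
    if len(mfcc) != len(f0):
        raise ValueError("mfcc and f0 must have the same number of frames")
    combined = []
    for coeffs, hz in zip(mfcc, f0):
        # Log-compress voiced F0; leave unvoiced frames at 0.0 (assumption).
        log_f0 = math.log(hz) if hz > 0 else 0.0
        combined.append(list(coeffs) + [log_f0])
    return combined


# Toy input: 4 frames of 13 placeholder MFCC coefficients.
mfcc = [[0.0] * 13 for _ in range(4)]
f0 = [0.0, 120.0, 125.0, 0.0]
features = append_pitch(mfcc, f0)
print(len(features), len(features[0]))  # 4 14
```

Each frame grows from 13 to 14 dimensions here; the resulting vectors would then be passed to acoustic-model training in place of the plain MFCC features.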