Research On Tibetan Speech Recognition Based On Sparse Coding

Posted on:2020-04-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhao

GTID:2438330578964437

Subject:Computer application technology

Abstract/Summary:

Compared with speech recognition for major languages such as Chinese and English, research on Tibetan speech recognition started late, in 2005, and languages differ from one another. Adopting new techniques to improve the performance of Tibetan speech recognition systems has therefore become an urgent problem in the field. For a Tibetan monosyllable recognition system, this thesis carried out the following work:

1. Feature extraction. A CNN that takes MFCCs as input can capture both temporal and spatial information. Two kinds of features were extracted in the experiments: static MFCCs and dynamic MFCCs.
2. Sparse coding. To remove correlation between features as far as possible and to reduce information irrelevant to classification, sparse coding was used to obtain sparse representations of both kinds of MFCCs. The K-SVD algorithm was used for sparse coding.
3. Classifier design. A CNN accepts a multidimensional matrix as input and so preserves the dimensionality of the input data. To capture spatial location features, a CNN was selected as the classifier in this study.
4. Tibetan speech recognition system based on sparse coding. In this system, the sparse representation of the MFCCs is fed into the CNN to recognize Tibetan monosyllabic speech. Sparse coding and the CNN were combined to improve the performance of the speech recognition system.

The following conclusions were drawn from the experiments:

1. Compared with a deep neural network, a CNN is better suited to processing high-dimensional data.
2. Dynamic MFCCs and sparse coding can improve the performance of a Tibetan speech recognition system.
3. The Tibetan speech recognition system based on sparse coding can be used for Tibetan speech recognition.

The main contribution of this study is the combination of sparse coding with a CNN to form a sparse-coding-based Tibetan speech recognition system.
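The feature and sparse-coding steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the function names `delta`, `omp`, and `ksvd`, the delta window of 2 frames, the dictionary size, and the sparsity level are all assumptions chosen for illustration, and random data stands in for real MFCC frames. Dynamic MFCCs are approximated here with the common regression-based delta formula, and the dictionary is learned with a basic K-SVD loop that uses orthogonal matching pursuit for the coding step.

```python
import numpy as np


def delta(features, width=2):
    """Regression-based delta (dynamic) features.

    features: array of shape (frames, coeffs), e.g. static MFCCs.
    width: number of frames on each side (assumed value, not from the thesis).
    """
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros(features.shape, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n: width + n + n_frames]
        behind = padded[width - n: width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def omp(D, x, k, tol=1e-10):
    """Orthogonal matching pursuit: code x with at most k atoms of D."""
    residual = x.astype(float).copy()
    support = []
    code = np.zeros(D.shape[1])
    coef = np.zeros(0)
    for _ in range(k):
        if np.linalg.norm(residual) < tol:
            break
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected atoms jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    if support:
        code[support] = coef
    return code


def ksvd(X, n_atoms, k, n_iter=10, seed=0):
    """Basic K-SVD. X has shape (dim, n_samples).

    Returns a dictionary D (dim, n_atoms) with unit-norm columns and
    sparse codes A (n_atoms, n_samples) with at most k nonzeros per column.
    """
    rng = np.random.default_rng(seed)
    # Initialise the dictionary from randomly chosen training samples.
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    A = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iter):
        # Sparse coding step.
        A = np.column_stack([omp(D, X[:, i], k) for i in range(X.shape[1])])
        # Dictionary update step: one rank-1 SVD update per atom.
        for j in range(n_atoms):
            users = np.nonzero(A[j])[0]
            if users.size == 0:
                continue
            A[j, users] = 0.0
            E = X[:, users] - D @ A[:, users]
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j] = U[:, 0]
            A[j, users] = s[0] * Vt[0]
    return D, A


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    static = rng.standard_normal((100, 13))       # stand-in for 13 static MFCCs
    feats = np.hstack([static, delta(static)])    # static + dynamic features
    D, A = ksvd(feats.T, n_atoms=40, k=4, n_iter=5)
    print(D.shape, A.shape)
```

In a pipeline like the one the abstract describes, the sparse code matrix `A` (one column per frame) would take the place of the raw MFCC matrix as the input to the CNN classifier; the network itself is omitted here because its architecture is not given.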

Keywords/Search Tags:

Tibetan, Speech Recognition, Sparse Coding, CNN, MFCC

Related items

1	Technology Of Tibetan Speech Recognition Based On Fast Walsh Transform
2	Study On MFCC And Lasso Reverberation Suppression Of Feature Extraction Algorithm Of Speech Recognition
3	Research On Tibetan Speech Recognition Based On Speech Spectral Features
4	Research On Text Dependent Speaker Recognition For Tibetan Amdo Dialect
5	Research On Tibetan Non-specific Continuous Speech Recognition Based On Deep Learning
6	Research On Online Tibetan Speech Recognition System
7	A Study On The Extraction Of Speech Depth In Tibetan Language And Its Speech Recognition
8	Research On Tibetan Lhasa Dialect Speech Recognition
9	Design Of End-to-end Amdo Tibetan Speech Recognition System Based On Deep Learning
10	Research On Audio And Video Speech Recognition In Tibetan Lhasa Dialect