Research On Continuous Speech Recognition Based On Deep Learning

Posted on:2021-03-22

Degree:Master

Type:Thesis

Country:China

Candidate:D F Shen

GTID:2518306512987369

Subject:Computer application technology

Abstract/Summary:

Since the beginning of the 21st century, with the rapid development of computer technology and artificial intelligence, communication between humans and machines has no longer been limited to the input and output of text symbols. With speech recognition technology, machines can understand what people say and even converse with them fluently. As a result, research on speech recognition, and on continuous speech recognition in particular, has become a hot topic. This thesis builds a continuous speech recognition system from three modules: automatic segmentation of continuous speech, an acoustic model, and a language model. The main work is as follows:

(1) Automatic segmentation of continuous speech. This thesis analyzes the features of the speech signal and selects suitable time-domain, frequency-domain, and cepstral-domain features as the basis for segmentation. First, the sound segments in the continuous speech are located by endpoint detection. Next, the voiced segments within those sound segments are found by pitch period trajectory detection. Subtracting the voiced segments from the sound segments leaves the consonant segments, and a consonant marks the beginning of a syllable. Finally, because the energy of a speech signal differs across frequency bands, the spectrogram is divided into 5 frequency bands and the energy changes in each band are tracked, which makes it possible to segment continuous vowel syllables as well as complex vowel syllables and consonant syllables. The experimental results show that this method achieves a good segmentation effect.

(2) Acoustic models. An acoustic model based on the hidden Markov model (HMM) and an acoustic model based on deep learning are constructed. 24-dimensional Mel-frequency cepstral coefficients (MFCCs) are extracted from the speech signal for training, and the same speech database is used for testing. The recognition accuracy and performance of several acoustic models are then compared. The experiments show that the acoustic model based on bidirectional long short-term memory (BLSTM) achieves a high recognition rate.

(3) Language model. This thesis constructs a language model based on N-grams, implements syllable-to-character conversion, and analyzes the advantages and disadvantages of the model. In addition, to improve the fault tolerance of the whole speech recognition system, further application experiments with the language model are carried out, and good results are obtained in text filling and text error correction.
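The syllable-to-character conversion in task (3) can be illustrated with a small sketch. The abstract does not give the N-gram order, smoothing method, or decoding algorithm, so everything below is an assumption made for illustration: a bigram model with add-one smoothing, Viterbi decoding, and a hypothetical toy corpus (`CORPUS`) and pinyin lexicon (`LEXICON`). It is not the thesis's implementation.

```python
import math
from collections import defaultdict

# Hypothetical toy data, for illustration only.
CORPUS = ["语音识别", "语音信号", "因为下雨"]
LEXICON = {
    "yu": ["语", "雨", "鱼"],
    "yin": ["音", "因", "银"],
    "shi": ["识", "是"],
    "bie": ["别"],
}


def train_bigram(sentences):
    """Count character unigrams and bigrams, with "<s>" as the start marker."""
    unigrams = defaultdict(int)
    bigrams = defaultdict(int)
    for sentence in sentences:
        prev = "<s>"
        unigrams[prev] += 1
        for ch in sentence:
            unigrams[ch] += 1
            bigrams[(prev, ch)] += 1
            prev = ch
    return unigrams, bigrams


def log_prob(prev, ch, unigrams, bigrams):
    """Add-one smoothed log P(ch | prev); the smoothing choice is an assumption."""
    vocab_size = len(unigrams)
    count = bigrams.get((prev, ch), 0) + 1
    total = unigrams.get(prev, 0) + vocab_size
    return math.log(count / total)


def decode(syllables, lexicon, unigrams, bigrams):
    """Viterbi search for the most probable character string for the syllables."""
    # best maps the last character to (log score, characters chosen so far).
    best = {"<s>": (0.0, [])}
    for syllable in syllables:
        new_best = {}
        for ch in lexicon[syllable]:
            score, path = max(
                (
                    (s + log_prob(prev, ch, unigrams, bigrams), p)
                    for prev, (s, p) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[ch] = (score, path + [ch])
        best = new_best
    return "".join(max(best.values(), key=lambda item: item[0])[1])


unigrams, bigrams = train_bigram(CORPUS)
print(decode(["yu", "yin", "shi", "bie"], LEXICON, unigrams, bigrams))  # 语音识别
```

Because each step keeps only the best path ending in each candidate character, the search cost grows linearly with the number of syllables. A syllable missing from the lexicon raises a `KeyError` in this sketch; a real system would need a fallback for that case.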

Keywords/Search Tags:

Speech recognition, Speech segmentation, Deep learning, Acoustic model, BLSTM, Language model

Related items

1	Design And Implementation Of Intelligent Speech Interaction
2	A Study On The Extraction Of Speech Depth In Tibetan Language And Its Speech Recognition
3	Research On Uyghur Speech Recognition Based On Deep Learning
4	Research On Adaptation Methods In Deep Learning Based Speech Recognition Systems
5	Air Traffic Control Speech Recognition Based On Deep Learning
6	Research On Acoustic Model Of Speech Recognition In Educational Scene Based On Deep Learning
7	Development Of Offline Speech Recognition System Based On Deep Learning
8	Research On Embedded Speech Recognition System Based On Deep Learning
9	Research And Implementation Of Mongolian-Chinese Mixed Language Speech Recognition System Based On Deep Learning
10	Research On Amdo Tibetan Speech Recognition Technology Based On Deep Learning