Research On Speech Recognition Technology And Application Of Local Dialect In Datong, Shanxi

Posted on:2021-02-26

Degree:Master

Type:Thesis

Country:China

Candidate:X F Liu

GTID:2428330602465446

Subject:Software engineering

Abstract/Summary:

Datong City lies in the northernmost part of Shanxi Province, and its local dialect is an important part of the Jin dialect. The linguistic features of this area are less complex than those of the dialects of central and southern Shanxi, so research on speech recognition for this regional dialect can lay a technical foundation for research on speech recognition of Shanxi local dialects in general. This thesis first introduces the linguistic characteristics of the Datong dialect and the construction of a Datong dialect speech data set, which is used to train the Datong dialect speech recognition model. The Datong dialect differs greatly from Mandarin in grammar, pronunciation, and other respects. In particular, it has an additional tone that Mandarin lacks, the entering tone. Because the entering tone is pronounced briefly, its audio duration is short, so its features occupy a smaller region of the spectrogram, which makes the spectral representation of the speech more complicated. To address this problem, and drawing on the structural characteristics of convolutional neural networks, this thesis proposes a "multi-core convolutional fusion network (MCFN)" to extract phoneme features of different durations from the spectrogram. This structure can be added before the acoustic model to improve its robustness. In addition, this thesis uses the attention mechanism to build an end-to-end Datong dialect speech translation model. The model treats the Datong dialect and Mandarin as two different languages: the speech signal features of the Datong dialect are fed into the end-to-end speech translation model and mapped to high-dimensional features, which are then put into correspondence with Mandarin Chinese text to produce the output. The model thus connects dialect speech directly to Mandarin text without using dialect text as an intermediate step, which reduces the negative impact of dialect text quality on the model. MCFN and the end-to-end speech translation model work together to convert Datong dialect speech into Mandarin text, and experiments show that this combination performs well. Research on speech recognition for the Datong dialect can broaden the user base of speech recognition and ease human-computer interaction for users with strong accents, and it can also be applied to fields such as identity authentication and medical auxiliary diagnosis. This work is also of great significance for protecting the intangible cultural heritage of Shanxi local dialects and for promoting barrier-free language communication across the country.
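The abstract does not give the internal layout of MCFN, so the following is only a rough sketch of the general idea it describes: parallel convolutions with different kernel widths run along the time axis of a spectrogram, and their outputs are fused so that both short and long phonemes leave a trace. The kernel widths, the fixed averaging weights (standing in for learned weights), fusion by concatenation, and the names `conv_time` and `multi_kernel_fusion` are all assumptions made for illustration, not details taken from the thesis.

```python
# Illustrative sketch only. Kernel widths, averaging weights, concatenation
# as the fusion step, and all function names are assumptions, not the
# thesis's actual MCFN design.

def conv_time(spec, width):
    """Convolve each frequency bin along the time axis with an averaging
    kernel of odd `width`, zero-padded so the frame count is unchanged."""
    half = width // 2
    n_frames = len(spec)
    n_bins = len(spec[0])
    out = []
    for t in range(n_frames):
        row = []
        for f in range(n_bins):
            total = 0.0
            for k in range(-half, half + 1):
                if 0 <= t + k < n_frames:  # frames outside the clip count as zero
                    total += spec[t + k][f]
            row.append(total / width)
        out.append(row)
    return out


def multi_kernel_fusion(spec, widths=(1, 3, 5)):
    """Run one branch per kernel width and fuse the branches by
    concatenating their outputs along the feature axis."""
    branches = [conv_time(spec, w) for w in widths]
    return [sum((b[t] for b in branches), []) for t in range(len(spec))]


# Toy "spectrogram": 6 frames x 2 frequency bins, with a one-frame burst
# at frame 2 standing in for a very short sound.
spec = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0],
        [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
fused = multi_kernel_fusion(spec)
```

In this toy input the width-1 branch keeps the burst confined to its own frame, while the wider branches spread it over neighbouring frames, so each fused frame carries views of the signal at several time scales. A trained network would presumably learn its kernel weights rather than use fixed averages.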

Keywords/Search Tags:

Datong Dialect, Speech Recognition, MCFN, Attention

Related items

1	Speech Emotion Recognition Of Datong Dialect Based On Deep Learning
2	End-to-end Dialect Speech Recognition Based On Weighted Sparse Attention Mechanism
3	Application Research Of Deep Learning In Speech Recognition Of Sichuan Dialect
4	Speech Recognition Of Hainan Dialect Based On Deep Learning
5	Speech Enhancement Method For Tibetan Speech Recognition In Lhasa Dialect
6	Research On Yangzhou Dialect Speech Recognition Based On Isolated Words
7	Research On Tibetan Lhasa Dialect Speech Recognition
8	Research On Dialect Accent Classification Based On Deep Learning
9	Research On Human-machine Interaction System For Automatic Speech Recognition In Xiangyang Dialect
10	Research On End-to-end Tibetan Speech Recognition Based On Deep Learning