Identification Of Spoken Language From FM Broadcast Using Deep Learning

Posted on:2020-03-30

Degree:Master

Type:Thesis

Country:China

Candidate:D Zhu

GTID:2428330575989331

Subject:Signal and Information Processing

Abstract/Summary:

With rapid socio-economic development and accelerating globalization, the increased mobility of people around the world has created more opportunities for people with different linguistic backgrounds to communicate with each other. Automatic language recognition, as the first step in speech recognition, is very important. The rapid development of artificial intelligence worldwide has also driven technological upgrades. As a bridging technology for human information exchange, voice technology has attracted more and more researchers working toward good voice interaction. The security of radio communications in border areas can also be monitored by means of speech recognition. Fast and precise language recognition is therefore of great importance for all subsequent work related to speech recognition. This thesis focuses on recognizing the spoken language of broadcasts and discusses language recognition methods in detail. The main research can be summarized as follows:

1) To meet the data set requirements of the language recognition field, data sets covering Lao, Putonghua, Burmese, Thai, and Vietnamese, amounting to about 25 hours, were collected, and the reliability of the data was confirmed by comparison with other data sets.
2) Using voice processing methods, a broadcast signal identification data set was established, and signal/non-signal identification of FM broadcast signals was analyzed with deep learning.
3) A reliable baseline system for language recognition was established using the I-Vector method, which provides a reliable basis for improvements in subsequent experiments.
4) Based on deep neural networks, two end-to-end language recognition methods that take acoustic features as input were designed for short, variable-length speech signals. The first is language recognition based on the Gated Recurrent Unit (GRU). Network structures with different parameters and the performance of different acoustic features were analyzed on three data sets, appropriate network parameters and structures were determined, and the features best suited to the deep learning network were identified. The second is a model that combines self-attention with deep convolutional neural networks (DCNN) for language recognition on variable-length speech.

The thesis compares the traditional acoustic feature model with the end-to-end models. The results show that the end-to-end methods achieve better recognition results than the I-Vector baseline.
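The abstract does not give the architecture of either end-to-end model, so the following is only a minimal sketch of one idea it mentions: using attention to turn a variable-length sequence of acoustic feature frames into a fixed-size vector that a classifier can score. It is a simplified single-vector attention pooling, not the thesis's self-attention plus DCNN model. The scoring vector `w`, the classifier weights, the feature dimension, and all function names are hypothetical and chosen purely for illustration; the label list reuses the five languages named in the abstract.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, w):
    """Pool T frame vectors (T may vary) into one vector of dimension D.

    frames: list of T feature vectors, each a list of D floats.
    w:      hypothetical scoring vector of D floats (would be learned).
    Returns (pooled_vector, attention_weights).
    """
    # One scalar relevance score per frame.
    scores = [sum(wi * xi for wi, xi in zip(w, frame)) for frame in frames]
    # Normalize scores so the weights sum to 1 regardless of T.
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted average of the frames: output size depends only on D.
    pooled = [
        sum(a * frame[d] for a, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


def classify(pooled, class_weights, labels):
    """Score the pooled vector with a linear layer and return the top label."""
    logits = [sum(c * p for c, p in zip(row, pooled)) for row in class_weights]
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Languages named in the abstract; everything below is illustrative only.
LABELS = ["Lao", "Putonghua", "Burmese", "Thai", "Vietnamese"]
FEATURE_DIM = 4  # assumed size; real acoustic features are much larger

rng = random.Random(0)
w = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
class_weights = [
    [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)] for _ in LABELS
]

# Two "utterances" of different lengths (random stand-ins for real features).
short_utt = [[rng.uniform(-1, 1) for _ in range(FEATURE_DIM)] for _ in range(3)]
long_utt = [[rng.uniform(-1, 1) for _ in range(FEATURE_DIM)] for _ in range(50)]

for name, utt in (("short", short_utt), ("long", long_utt)):
    pooled, weights = attention_pool(utt, w)
    label, probs = classify(pooled, class_weights, LABELS)
    print(name, "frames:", len(utt), "pooled dim:", len(pooled), "label:", label)
```

Because the weights here are random, the predicted labels carry no meaning. The point of the sketch is only that utterances with 3 and 50 frames both map to a vector of the same size, which is what lets a single classifier handle variable-length input.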

Keywords/Search Tags:

signal detection, I-Vector, deep convolutional neural networks, self-attention, language identification

Related items

1	Research On Speech Language Identification Based On Deep Learning Network
2	Research On Pedestrian Re-identification Based On Deep Learning
3	Deep Learning-Based Methods For Text Detection And Recognition In Natural Images
4	Research On Recognition Algorithm Of Radar Emitter Signal Based On Deep Neural Networks
5	Research On ECG Signal Identification Based On Convolutional Neural Network
6	Language Identification Based On Convolutional Neural Network
7	Research On Key Technologies Of Image Saliency Detection Based On Deep Neural Networks
8	Research On Deep Learning Based Antenna Selection And Signal Detection In DM-GSM System
9	Research Of Multiobject Detection And Tracking Via Deep Convolutional Neural Network
10	Research On Offline Writer Identification Based On Deep Convolutional Neural Networks