Research And Application Of Audio Keyword Recognition Technology Based On Generative Adversarial Network

Posted on:2021-03-20

Degree:Master

Type:Thesis

Country:China

Candidate:Y M Tian

Full Text:PDF

GTID:2428330620464040

Subject:Engineering

Abstract/Summary:

Keyword recognition is the detection of predefined keywords in a continuous speech stream. Following the breakthroughs of deep neural networks in speech recognition, most keyword recognition research builds on speech recognition. Methods of this type use acoustic and language models to decode the speech signal into a text sequence, then search the text for keywords. Although this approach can detect keywords, it has the following problems:

1. Keyword recognition accuracy depends on the accuracy of both the speech recognition and the text search.
2. It cannot handle languages without a written form. Because the speech must be transcribed into a text sequence, the approach does not apply to languages that lack a writing system, such as certain dialects and minority languages.
3. It cannot provide timing information for keywords. The timing of a keyword is lost once the speech is converted into text.

To address problems 2 and 3, this thesis designs a keyword recognition method that can identify keywords in languages without a written form and obtain the time at which each keyword occurs in the speech. The thesis applies generative adversarial networks (GANs) to keyword recognition and proposes a GAN-based audio keyword recognition method. In this method, Mel Frequency Cepstral Coefficients (MFCCs) are extracted and fed directly to the generator. The generator learns the characteristics of the keywords and outputs keyword timing information. The discriminator plays a supervisory role, pushing the sequence produced by the generator closer to the manually labeled sequence. To obtain the position of the keywords in the speech, the algorithm defines a loss function that ensures the generated mask sequence both detects the keywords and locates them. Experiments show that the GAN-based algorithm can identify keywords without converting the speech to text. On a self-made data set, the accuracy of the algorithm reaches up to 80%.

The thesis uses the proposed algorithm to design and implement an audio keyword detection system with online recording, model training, keyword detection, and result viewing functions. After the user selects a model, keywords are identified from online recordings, and the system returns the recognition results to the interface for the user to view. The system also has a keyword blocking function: when it detects keywords, it masks them in the speech with noise to protect sensitive and private information.
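
As a rough illustration of how a frame-level mask sequence could yield keyword timing and drive the keyword blocking function, here is a minimal Python sketch. It is not the thesis's implementation. The function names, the 10 ms frame hop, the 0.5 threshold, and the use of uniform random noise are all assumptions made for this example; the abstract does not specify any of them.

```python
import random


def mask_to_intervals(mask, hop_s=0.010, threshold=0.5):
    """Turn per-frame keyword scores into (start_s, end_s) intervals.

    `mask` stands in for the generator's output sequence. The frame hop
    and threshold are assumed values, not taken from the thesis.
    """
    intervals = []
    start = None  # index of the first frame of the current keyword run
    for i, score in enumerate(mask):
        active = score >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            intervals.append((start * hop_s, i * hop_s))
            start = None
    if start is not None:  # run extends to the end of the sequence
        intervals.append((start * hop_s, len(mask) * hop_s))
    return intervals


def block_keywords(samples, intervals, sample_rate=16000,
                   noise_amp=0.1, seed=0):
    """Return a copy of `samples` with keyword intervals replaced by noise.

    Uniform noise is an assumption; the abstract only says "noise".
    """
    rng = random.Random(seed)
    out = list(samples)
    for start_s, end_s in intervals:
        lo = max(0, int(round(start_s * sample_rate)))
        hi = min(len(out), int(round(end_s * sample_rate)))
        for n in range(lo, hi):
            out[n] = rng.uniform(-noise_amp, noise_amp)
    return out


if __name__ == "__main__":
    # Hypothetical mask: frames 2-4 and 6-7 are flagged as keyword frames.
    demo_mask = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.6, 0.6]
    demo_intervals = mask_to_intervals(demo_mask)
    silence = [0.0] * 1280  # 80 ms of audio at 16 kHz
    blocked = block_keywords(silence, demo_intervals)
    print(demo_intervals, len(blocked))
```

Because the intervals come straight from frame indices, no text transcription is involved, which matches the abstract's point that timing information survives when speech is never converted to text.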

Keywords/Search Tags:

deep learning, generative adversarial networks, audio keyword recognition, keyword targeting

Related items

1	Research On Adversarial Sample Detection Based On Speech Recognition System
2	Research And Implementation Of Keyword Spotting System With Large Keyword Table In Spontaneous Speech
3	Research On Keyword Extraction Algorithms Based On Semantic Features
4	Research And Applications Of Speech Keyword Recognition Technology
5	Research And Application Of Image Recognition Method Based On Deep Generative Adversarial Networks
6	Underwater Target Recognition Based On Generative Adversarial Networks
7	Reasearch Of Deep Learning In Audio Signal Processing
8	Research And Application Of Facial Expression Recognition Based On Deep Generative Adversarial Learning Technology
9	Research On Keyword Recognition Based On Query By Example
10	Feature Learning Methods Based On Deep Generative Networks