Research And Application Of Audio Keyword Recognition Technology Based On Generative Adversarial Network

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Tian
Full Text: PDF
GTID: 2428330620464040
Subject: Engineering
Abstract/Summary:
Keyword recognition refers to the detection of predefined keywords in a continuous speech stream. Owing to the breakthroughs of deep neural networks in speech recognition, research on keyword recognition is mainly based on speech recognition. This type of method uses acoustic models and language models to decode speech signals into text sequences, and then uses text search to find the keywords. Although this approach can detect keywords, it has the following problems: 1. Keyword recognition accuracy is affected by both the speech recognition and the text search methods. 2. It cannot handle languages without a written form. The method needs to transcribe speech into a text sequence, so it is not applicable to languages without a writing system, such as dialects and minority languages. 3. It cannot obtain timing information for the keywords. The timing information of the keywords is lost once the speech is converted into text.

Aiming at problems 2 and 3, this thesis designs a keyword recognition method that can identify keywords in languages without a written form and obtain the time information of the keywords in the speech. The thesis uses generative adversarial networks (GAN) for keyword recognition and proposes an audio keyword recognition method based on GAN. In this method, Mel Frequency Cepstral Coefficients (MFCC) are extracted and fed directly to the generator. The generator learns the features of the keywords and outputs keyword timing information. The discriminator in the GAN plays a supervisory role, pushing the sequence produced by the generator closer to the manually annotated label sequence. To obtain the position of the keywords in the speech, the algorithm defines a loss function designed so that the generated mask sequence both detects the keywords and gives their positions. The experiments show that the GAN-based audio keyword recognition algorithm can identify keywords without the speech being converted to text. On a self-built data set, the accuracy of the algorithm reaches up to 80%.

The thesis uses the proposed algorithm to design and implement an audio keyword detection system. The system includes online recording, model training, keyword detection, and result viewing functions. After the user selects a model, keywords are identified from an online recording, and the system returns the recognition results to the interface for the user to view. In addition, the system has a keyword blocking function: when it detects keywords, it masks them in the speech with noise, in order to protect sensitive and private information.
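As a rough illustration of the last two steps described in the abstract (reading keyword positions from a mask sequence and covering the detected keywords with noise), the following Python sketch may help. It is not the thesis's implementation. The function names, the 10 ms frame hop, the 0.5 threshold, the 16 kHz sample rate, and the Gaussian noise are all assumptions made for the example, and the mask is supplied by hand rather than produced by a trained generator.

```python
import numpy as np


def mask_to_intervals(mask, hop_s=0.010, threshold=0.5):
    """Turn per-frame keyword scores into (start_s, end_s) intervals.

    hop_s and threshold are assumed values, not taken from the thesis.
    """
    active = np.asarray(mask) >= threshold
    # Pad with False so every active run has a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # first active frame of each run
    ends = np.flatnonzero(edges == -1)    # first inactive frame after each run
    return [(float(s * hop_s), float(e * hop_s)) for s, e in zip(starts, ends)]


def mask_keywords_with_noise(audio, intervals, sample_rate=16000,
                             noise_std=0.1, seed=0):
    """Return a copy of audio with each interval replaced by Gaussian noise.

    sample_rate, noise_std and the noise type are assumptions for this sketch.
    """
    rng = np.random.default_rng(seed)
    out = np.array(audio, dtype=np.float64, copy=True)
    for start_s, end_s in intervals:
        a = min(int(round(start_s * sample_rate)), len(out))
        b = min(int(round(end_s * sample_rate)), len(out))
        if b > a:
            out[a:b] = rng.normal(0.0, noise_std, size=b - a)
    return out


# Hand-written mask for ten 10 ms frames; frames 2-4 and 6 are "keyword" frames.
example_mask = [0, 0, 1, 1, 1, 0, 1, 0, 0, 0]
example_audio = np.ones(1600)  # 0.1 s of placeholder audio at 16 kHz
example_intervals = mask_to_intervals(example_mask)
example_masked = mask_keywords_with_noise(example_audio, example_intervals)
```

Working on frame indices and converting to seconds only at the end keeps the run detection exact. The same intervals could be reported to the user as keyword timing information or used, as here, to decide which samples to overwrite.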
Keywords/Search Tags: deep learning, generative adversarial networks, audio keyword recognition, keyword targeting