Research On Speech Detection Based On LSTM Network And GMM

Posted on:2020-03-13

Degree:Master

Type:Thesis

Country:China

Candidate:H Z Zheng

Full Text:PDF

GTID:2428330575459197

Subject:Engineering

Abstract/Summary:

Crying is a unique language of infants and an important way for them to convey information. Infant crying reflects a wide range of psychological and physiological needs, so studying it can help caregivers understand what a cry means and care for infants better. The subject of this paper comes from a company's need to analyze infant crying. The company intends to collect a large amount of infant crying data for analysis, but the collected recordings often also contain adult speech. To protect privacy, the company needs to detect adult speech in the infant crying audio stream and remove it effectively. In response to this need, this paper studies speech detection based on an LSTM network, on GMM models, and on an LSTM-GMM-RNN model. The aim is to recognize adult speech in the audio stream, which is of practical significance for protecting user privacy.

Taking infant crying analysis as the research background, this paper focuses on privacy protection during infant crying data collection and studies adult speech detection. The detailed research work includes:

1) The company's infant crying database and adult speech database are analyzed through time-domain waveforms and spectrograms. The signal differences between infant crying and adult speech are summarized by listening to audio streams that contain both, and the audio features that are discriminative in distinguishing infant crying from adult speech are analyzed.

2) Five feature sets are extracted as audio features: MFCC, MFCC+energy, MFCC+pitch, PLP, and PLP+energy. A deep neural network with two LSTM layers is constructed and used as the classification model, and speech detection experiments are carried out with each feature set.

3) Three speech detection schemes are constructed based on GMMs: detection based on an infant crying GMM, detection based on an adult speech GMM, and detection based on the combination of the infant crying GMM and the adult speech GMM.

4) To further improve detection accuracy, an RNN is used to combine the recognition results of the LSTM network and the GMM models for classification, yielding a speech detection algorithm based on the LSTM-GMM-RNN model. Compared with the detection algorithms based on the LSTM network or the GMM models alone, the accuracy of this algorithm is greatly improved.

The proposed speech detection algorithms based on the LSTM network, on the GMM models, and on the LSTM-GMM-RNN model can all detect adult speech in infant crying audio streams well. Once adult speech has been removed from the infant audio streams, the proposed algorithms protect privacy in the data collection procedure.
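The abstract does not say how the infant crying GMM and the adult speech GMM are combined in the third scheme of item 3). One common choice, assumed here purely for illustration, is a frame-level log-likelihood ratio test between the two models. The sketch below uses diagonal-covariance GMMs with hand-picked toy parameters in two dimensions; the function names, the threshold, and all numeric values are hypothetical and are not taken from the thesis, where the models would be trained on features such as MFCC or PLP.

```python
import numpy as np

def gmm_log_likelihood(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (K,);
    means and variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                       # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, K)
    weighted = log_comp + np.log(weights)                               # (T, K)
    # Log-sum-exp over components, shifted by the row maximum for stability.
    peak = weighted.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.sum(np.exp(weighted - peak), axis=1))

def detect_adult_speech(frames, adult_gmm, cry_gmm, threshold=0.0):
    """Flag a frame as adult speech when the log-likelihood ratio
    (adult model minus crying model) exceeds the threshold."""
    llr = (gmm_log_likelihood(frames, *adult_gmm)
           - gmm_log_likelihood(frames, *cry_gmm))
    return llr > threshold

# Hypothetical toy models: (weights, means, variances), two components each.
adult_gmm = (np.array([0.5, 0.5]),
             np.array([[0.0, 0.0], [1.0, 1.0]]),
             np.ones((2, 2)))
cry_gmm = (np.array([0.5, 0.5]),
           np.array([[5.0, 5.0], [6.0, 6.0]]),
           np.ones((2, 2)))

# Three toy frames: the first and third lie near the adult model,
# the second lies near the crying model.
frames = np.array([[0.2, 0.1], [5.5, 5.8], [0.9, 1.2]])
is_adult = detect_adult_speech(frames, adult_gmm, cry_gmm)
```

Under this assumption, the two single-model schemes in item 3) would correspond to thresholding one model's log-likelihood on its own, while the ratio test uses both models and needs only one threshold to tune.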

Keywords/Search Tags:

Speech detection, Infant crying analysis, LSTM, GMM, RNN

Related items

1	Design And Implementation Of Infant Crying Detection And Alarm System
2	Analysis Of Infant Crying Sentiment Based On Deep Learning
3	Infant Crying Detection Under Monitoring Scenario
4	Researches And Be Implemented On Speaker Sensibilities Analyze And Algorithms
5	Research On Infant Emotional Information Recognition Method Based On Infant Cry Detection
6	Analysis Of Crying Cause By Baby Crying Algorithm
7	Study Of Artificial Intelligence Flight Co-Pilot Speech Recognition Technology
8	The Feature Analysis And Recognition Of Infant Cry
9	Study On Intelligent Detection Of Synthetic Speech Based On Cepstral Coefficient
10	Research On Toxicity Detection Of Internet Speech Based On Deep Learning