Application Of MFCC Based On Wavelet Packet Decomposition In Sound Recognition In Complex Environment

Posted on:2020-06-16

Degree:Master

Type:Thesis

Country:China

Candidate:S H Yu

Full Text:PDF

GTID:2428330578958864

Subject:Computer application technology

Abstract/Summary:

With the rapid development of artificial intelligence, sound recognition, as one of its component technologies, is maturing. However, noise interference in complex environments and the complexity of the sounds themselves make recognition more difficult. In complex scenes such as cities, a recording generally contains a great deal of information, and recognizing the sounds in it remains an open problem. The variety of sounds in such scenes is a major challenge for traditional recognition models. This thesis recognizes and classifies six sounds from Google's acoustic data set: children playing, dog bark, sea wave, whistle, chain saw, and electric drill. It adopts the mature template matching approach and recognizes sound through the steps of preprocessing, feature extraction, and model classification and recognition. For preprocessing, the energy detection technique from communication technology is used. For feature extraction, an MFCC feature extraction method based on wavelet packet decomposition is proposed. For model classification, the convolutional neural network is modified to take the feature map of the one-dimensional sound signal as input, which reduces computing time. The main work of this thesis is as follows:

(1) For preprocessing, we compare single-node spectrum sensing techniques from signal detection, including matched filter detection and energy detection, against the characteristics of sound signals, weighing their implementation difficulty, simplicity, and other advantages and disadvantages, and select energy detection. The energy detection method treats sound as a signal to be processed. First, the target signal is passed through a filter, and the energy of the filtered signal over the observation period is obtained by taking the squared magnitude of the samples and accumulating them. The ratio of this energy to the noise variance is then compared with a preset threshold, and the segments that pass are retained as the useful target signal.

(2) A feature extraction method based on Mel cepstrum coefficients and wavelet packet decomposition is adopted. Traditional Mel cepstrum coefficients simulate the human auditory system and perform very well in conventional acoustic recognition, but in some special scenarios their stability and noise robustness are unsatisfactory. This thesis therefore replaces the Fourier transform in traditional MFCC with the wavelet packet transform. First, the target signal is divided into frames and windowed. Each frame is then decomposed by a wavelet packet transform combined with the Mel scale. After normalization, the feature parameters are obtained by a logarithm operation and a discrete cosine transform. The resulting features not only imitate the recognition ability of the human ear but also have some noise robustness in complex environments.

(3) The convolutional neural network is optimized for the one-dimensional nature of the sound signal by changing and adjusting its structure. It is compared with the traditional recognition model, and the effect of different sampling methods on the recognition rate in complex environments is explored. The experimental results show that the recognition rate of the proposed method is higher than that of the traditional recognition model, and that maximum sampling (max pooling) retains more features of the target signal in noisy environments than mean sampling (mean pooling), thus achieving better recognition results.
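
The energy-detection preprocessing described in item (1) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `energy_detect`, the frame length, the per-sample normalization, and the threshold value are all assumptions chosen for illustration, and the initial filtering stage is omitted.

```python
import numpy as np

def energy_detect(signal, frame_len=256, noise_var=1.0, threshold=2.0):
    """Keep the frames whose energy-to-noise-variance ratio exceeds a threshold.

    Hypothetical helper: the name and all default values are illustrative.
    """
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    # Squared magnitude of each sample, accumulated over the frame.
    energy = np.sum(np.abs(frames) ** 2, axis=1)
    # Ratio of energy to noise variance. Dividing by frame_len as well is an
    # assumption, made so the threshold does not depend on the frame length.
    stat = energy / (frame_len * noise_var)
    mask = stat > threshold
    return frames[mask], mask

# Synthetic example: four frames of weak Gaussian noise, with a tone added
# to the second frame only.
rng = np.random.default_rng(0)
frame_len = 256
x = rng.normal(0.0, 0.1, 4 * frame_len)  # noise variance is 0.1 ** 2 = 0.01
tone = np.sin(2 * np.pi * 440 * np.arange(frame_len) / 8000)
x[frame_len : 2 * frame_len] += tone

kept, mask = energy_detect(x, frame_len=frame_len, noise_var=0.01, threshold=5.0)
print(mask)
```

In this example only the frame containing the tone passes the threshold. In practice the noise variance would have to be estimated from the recording, and the threshold sets the trade-off between missed sounds and false alarms.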

Keywords/Search Tags:

Convolutional Neural Network, Sound Recognition, Wavelet Packet Decomposition, Energy Detection, Mel Cepstrum Coefficient

Related items

1	Sound Event Recognition Based On Frequency Band Decomposition
2	Sound Recognition Based On Energy Detection For Complex Environments
3	Deep Learning Based Sound Recognition Classification System
4	Speaker Recognition Using Wavelet Packet Decomposition Based On MPEG-I
5	Robust Sound Recognition Based On Brain-Inspired Computing Method
6	Study Of Face Recognition Based On Wavelet Packet And Wavelet Multi-level Linear Subspace Feature Extraction Algorithm
7	Research And Development Of Defect Detection System Of Lifting Motor Based On Audio Signal
8	Study Of Speech Recognition Algorithm Based On Modified LP Cepstrum And Neural Network
9	Study On Accurate Face Detection And Multi-Resolution Face Recognition
10	Hmm And Wavelet Neural Network-based Speech Recognition System