
Research On Audio Feature Extraction And Context Recognition Based On Deep Neural Networks

Posted on: 2016-11-22 | Degree: Master | Type: Thesis
Country: China | Candidate: N F Wang
GTID: 2308330479991057 | Subject: Computer technology
Abstract/Summary:
As one direction of artificial intelligence research, audio context recognition lets a machine perceive dynamic information about its environment from nearby sounds, which is important for the intelligent decisions the machine makes next. In recent years many researchers have worked in this area, and most follow the same framework: feature extraction first, then pattern classification. Many of them have concentrated on how to extract discriminative features that reflect the acoustic properties of the audio context. Acoustic features can be roughly divided into two classes: short-time features and long-time features. Short-time features include single-channel mel-frequency cepstral coefficients (MFCCs), multi-channel MFCCs, and combinations of MFCCs with sparse features. Most long-time features are statistics computed over long audio segments or features based on semantic correlation. Reported results show that all of these features have shortcomings: short-time features cannot fully describe the acoustic context, while long-time features may miss the detailed internal information of an audio segment, which is very important for classifying the audio context. This thesis focuses on feature extraction methods that capture both long-time properties and local structural acoustic properties, and demonstrates their effectiveness for audio context recognition through experiments. Features well suited to classification can be obtained with a deep neural network, which learns them by itself. This has been demonstrated in the structural analysis of images, especially natural images, where learned features reflect structural information better than subjective human analysis.
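The distinction between short-time and long-time features can be illustrated with a minimal NumPy sketch. It is not the thesis's feature pipeline: the per-frame log power spectrum is a simplified stand-in for MFCCs, and the 16 kHz sampling rate, 400-sample frames, 160-sample hop, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def short_time_features(signal, frame_len=400, hop=160):
    """Per-frame log power spectrum: a simplified stand-in for MFCCs."""
    frames = frame_signal(signal, frame_len, hop) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)

def long_time_statistics(frame_feats):
    """Segment-level mean and standard deviation over all frames."""
    return np.concatenate([frame_feats.mean(axis=0), frame_feats.std(axis=0)])

rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)   # 1 s of synthetic noise, assumed 16 kHz
st = short_time_features(audio)      # one feature vector per frame
lt = long_time_statistics(st)        # one feature vector per segment
```

The per-frame matrix keeps local detail but describes only a few milliseconds at a time, while the segment-level statistics summarize the whole clip and discard its internal structure, which is the trade-off described above.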
We therefore analyze long-time features in the spectrogram of the audio context, relying on the feature analysis ability of deep neural networks. The thesis is organized as follows. First, we study an audio context recognition method based on the convolutional neural network (CNN). The convolution and down-sampling operations of the CNN apply a nonlinear mapping to the input data, and the parameters are updated by back-propagating the error, so that the network extracts audio features describing the properties of the audio context. Feature analysis and classification are treated as two parts of a single training process, which is supervised and therefore requires a large amount of labeled data. In practice, labeled data is difficult to obtain and costs considerable time and manual labor, so we also introduce a second feature analysis method based on the deconvolutional neural network (DeCNN). The DeCNN retains both the convolution and the down-sampling operations, and differs from the CNN in that its parameters are updated by back-propagating the reconstruction error of the input data rather than an error computed from the labels of the input data. It therefore does not need as much labeled data, and because unlabeled data is easy to obtain, the shortage of labeled training data can be addressed. Experimental results show that the long-time structural features based on the CNN substantially improve the audio context recognition rate over the baseline system. Although the DeCNN results are not as good, the DeCNN addresses the shortage of labeled training data.
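The contrast between the two training signals can be sketched in NumPy as a single forward pass. This is a hedged illustration, not the architecture used in the thesis: the 10x10 input patch, the single 3x3 kernel, the tanh nonlinearity, the five classes, and the decoder that reuses the encoder kernel are all assumptions, and no parameter update is performed.

```python
import numpy as np

def conv2d_valid(x, k):
    """Valid 2-D cross-correlation of one map x with one kernel k."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling (the down-sampling step)."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def decode(pooled, k, size=2):
    """Assumed decoder: up-sample, then full convolution with the same kernel."""
    up = np.kron(pooled, np.ones((size, size)))
    padded = np.pad(up, ((k.shape[0] - 1,) * 2, (k.shape[1] - 1,) * 2))
    return conv2d_valid(padded, k[::-1, ::-1])

rng = np.random.default_rng(0)
patch = rng.standard_normal((10, 10))        # stand-in spectrogram patch
kernel = rng.standard_normal((3, 3)) * 0.1   # one convolution kernel

# Shared feature extraction: convolution, nonlinearity, down-sampling.
feat = max_pool(np.tanh(conv2d_valid(patch, kernel)))

# Supervised signal (CNN): cross-entropy against a label.
n_classes, label = 5, 2                      # hypothetical label
w = rng.standard_normal((feat.size, n_classes)) * 0.1
logits = feat.ravel() @ w
probs = np.exp(logits - logits.max())
probs /= probs.sum()
cross_entropy = -np.log(probs[label])

# Unsupervised signal (DeCNN-style): reconstruction error, no label needed.
recon = decode(feat, kernel)
recon_error = np.mean((patch - recon) ** 2)
```

Only the cross-entropy term depends on `label`; the reconstruction error is computed from the input patch alone, which is why the second approach can be trained on unlabeled audio.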
Keywords/Search Tags:Audio feature extraction, CNN, DeCNN, Audio context recognition