Research On Bird Sound Recognition Method Based On Multi-Feature Fusion

Posted on: 2023-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Wu
Full Text: PDF
GTID: 2530306905999979
Subject: Engineering
Abstract/Summary:
Birds add life, sound and color to people's lives. They are also considered one of the most important indicators of environmental conditions: a change in bird numbers is often the first sign of an environmental problem. As part of the natural balance, birds have both ecological and economic value. Automatic species identification can be of great help in protecting birds, so identifying bird species from images or sound has become a research focus in recent years. Bird images are difficult to acquire, bird sound features often fail to capture the differences between species, and a single classification model tends to give poor recognition accuracy. To address these problems, this thesis studies sound processing, feature extraction, and classification for bird sound recognition, and explores four classification frameworks. The specific work is as follows:

1. An acoustic feature-based classification method. A Bark-scale spectrogram was first created, from which Statistical Spectrum Descriptors (SSD) and Rhythm Patterns (RP) were extracted. Mel-Frequency Cepstral Coefficient (MFCC) features were extracted as well. These acoustic features were fed into a Support Vector Machine (SVM) classifier for training and testing as a baseline method. The best recognition rate, 82.1%, was obtained with the SSD descriptors.

2. A visual feature-based classification method. An audio-to-image algorithm was used to convert the birdsong audio signal into four different visual images, including spectrograms, harmonic and percussive (impact) maps, and scattergrams. Six traditional handcrafted features were extracted from these audio-generated images and were also used to train and test an SVM classifier for comparison. The results show that the handcrafted visual features clearly outperform the acoustic features. Among them, multiscale local phase quantization (MLPQ) obtained the highest recognition rate of 88.0%.

3. A deep learning-based classification method. The audio-converted visual images are fed directly into a convolutional neural network (CNN), which learns the feature representation automatically. Transfer learning was used to carry over the feature extraction layers of a CNN pre-trained on the ImageNet dataset to the bird classification task. Six different CNN architectures were tested and compared. The results show that transfer learning enables robust audio classification. Among the tested networks, Xception gave the best classification result of 90.6%.

4. A multi-model fusion classification method. To further improve recognition accuracy, the outputs of the different visual and acoustic descriptors were first combined. The results show that an ensemble of acoustic features alone and an ensemble of visual features alone perform similarly, but fusing the two significantly improves accuracy. Second, combining different fine-tuned CNNs was found to be effective in improving recognition. Finally, combining the CNN ensemble with the handcrafted features obtained from the spectrogram and with the acoustic features gave the best recognition accuracy of 98.0%.

The results indicate that the best classification performance is achieved by late fusion of the acoustic feature approach, the visual feature approach, and deep learning.
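To make the late-fusion idea concrete, here is a minimal sketch in Python with NumPy. It is not the thesis's implementation: the abstract does not state which fusion rule was used, so the sketch assumes a (weighted) sum rule over per-class probability scores. The function name `late_fusion`, the three score matrices, and all the numbers in them are hypothetical and chosen only for illustration.

```python
import numpy as np

def late_fusion(prob_matrices, weights=None):
    """Fuse per-model class scores with a weighted sum rule (an assumed rule).

    prob_matrices: list of arrays, each of shape (n_samples, n_classes)
    weights: optional per-model weights; equal weighting if omitted
    Returns the fused score matrix and the predicted class index per sample.
    """
    stacked = np.stack(prob_matrices, axis=0)   # (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(stacked.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()           # normalize so weights sum to 1
    fused = np.tensordot(weights, stacked, axes=1)  # (n_samples, n_classes)
    return fused, fused.argmax(axis=1)

# Hypothetical scores for 2 recordings and 3 species from three model families.
acoustic = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])  # e.g. SVM on acoustic features
visual   = np.array([[0.2, 0.5, 0.3], [0.1, 0.3, 0.6]])  # e.g. SVM on visual features
cnn      = np.array([[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]])  # e.g. fine-tuned CNN

fused, labels = late_fusion([acoustic, visual, cnn])
print(labels)  # [0 2]
```

In this toy example the acoustic model alone would assign the second recording to class 1, while the fused scores assign it to class 2, which is the kind of correction that combining complementary models is meant to provide. Passing `weights` lets stronger models count for more; how the weights would be chosen is left open here.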
Keywords/Search Tags: deep learning, bird species recognition, acoustic features, visual features, multi-model fusion