
Research And Implementation Of Fixed Audio Retrieval Technology In Different Application Scenarios

Posted on: 2021-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: W B Zhao
Full Text: PDF
GTID: 2518306470968889
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the advent of the intelligent era, audio signal processing technology has attracted more and more attention. Real-life sound scenes contain rich and useful information. When there is a potential danger in an environment, the key features of abnormal sounds in the scene can be identified and retrieved in order to monitor the environment for dangerous sounds. In addition, when people need to find a target audio file among a large number of audio files, they can use the key features of the audio for retrieval and recognition, which improves efficiency. Both the abnormal sound retrieval system and the normal sound retrieval system need to extract audio features and build an audio feature library. To address audio retrieval in different application scenarios, this work is divided into two parts. The first improves the performance of the abnormal audio retrieval system by improving the feature parameters of abnormal audio; the second improves audio retrieval speed by reducing the amount of data in the audio feature database. The detailed research work is as follows:

(1) For dangerous sound scene detection, an abnormal audio retrieval method based on Mel-frequency cepstral coefficients (MFCC) and vector quantization (VQ) is proposed. Firstly, an abnormal audio database is created from the Internet, Warner sound database and Sonny sound database. The database consists of explosions, screams, emergency braking, crying, alarms, gunshots, falls, breaking glass and vehicle horns. Secondly, the time-domain and frequency-domain characteristics of the abnormal sounds are analyzed. Based on these time-frequency characteristics, MFCC is chosen as the feature parameter for abnormal sound detection. The analysis shows that the first MFCC coefficient of an audio signal corresponds to the low-frequency band information of the signal, and that this coefficient contains relatively little useful information and contributes little to retrieval accuracy. Therefore, the first dimension of the MFCC parameters is discarded and replaced by the short-term energy (STE) in the time domain, giving the combined feature MFCC-STE. Thirdly, after comparing the advantages and disadvantages of various recognition model classifiers, the vector quantizer is selected as the abnormal sound classifier, and MFCC-STE is used as the abnormal sound feature parameter to perform abnormal audio retrieval and classification. Finally, the simulation results show that MFCC-STE improves the performance of the abnormal audio retrieval system, which indicates that the proposed method has research significance and practical value for abnormal sound retrieval systems.

(2) To address the large amount of data and slow retrieval speed of existing audio retrieval, a fixed audio retrieval method based on compressed sensing and audio fingerprint dimensionality reduction is proposed. In the training stage, firstly, the sample audio signal is sparsified and the sparse audio data is compressed by the compressed sensing algorithm; secondly, the audio fingerprint of the compressed signal is extracted; thirdly, the discrete Gini coefficient of the audio fingerprint is introduced, and the dimension of the fingerprint is reduced by calculating the discrete Gini coefficient of each fingerprint dimension, which finally yields the retrieval feature library. In the retrieval stage, the same algorithm as in the training stage is used to extract the features of the audio under test, which are then matched against the audio feature database to obtain the retrieval result. The experimental results show that the proposed audio retrieval method greatly reduces the storage required by the sample audio database and improves audio retrieval speed while maintaining good retrieval accuracy.
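
The feature fusion and classification described in part (1) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the MFCC matrix is assumed to have been computed elsewhere with one row per frame, the short-term energy is taken as the plain sum of squared samples per frame, and the function names (`short_time_energy`, `mfcc_ste`, `vq_distortion`, `classify`) and the per-class codebooks are hypothetical. The abstract does not state how the energy is scaled or how the codebooks are trained.

```python
import numpy as np


def short_time_energy(frames):
    """Sum of squared samples per frame; frames has shape (n_frames, frame_len)."""
    return np.sum(np.asarray(frames, dtype=np.float64) ** 2, axis=1)


def mfcc_ste(mfcc, frames):
    """Replace the first MFCC coefficient of every frame with that frame's STE.

    mfcc:   (n_frames, n_coeffs) matrix, assumed to be computed elsewhere.
    frames: (n_frames, frame_len) time-domain frames aligned with mfcc.
    """
    mfcc = np.asarray(mfcc, dtype=np.float64)
    frames = np.asarray(frames, dtype=np.float64)
    if mfcc.shape[0] != frames.shape[0]:
        raise ValueError("mfcc and frames must cover the same number of frames")
    fused = mfcc.copy()
    fused[:, 0] = short_time_energy(frames)  # assumption: unscaled energy
    return fused


def vq_distortion(features, codebook):
    """Mean squared distance from each feature vector to its nearest codeword."""
    diff = features[:, None, :] - codebook[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return float(np.mean(np.min(sq_dist, axis=1)))


def classify(features, codebooks):
    """Return the label whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda label: vq_distortion(features, codebooks[label]))
```

In use, one codebook per abnormal sound class would be trained offline (for example by clustering that class's MFCC-STE vectors), and `classify` would then assign a query clip to the class with the smallest quantization distortion.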
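
The fingerprint dimensionality reduction in part (2) can be illustrated with a minimal sketch. The abstract does not define the "discrete Gini coefficient", so this example assumes it behaves like the Gini impurity of the discrete values observed in each fingerprint dimension, 1 - sum(p_k^2), and that the dimensions with the highest score are kept because they vary most across the library. Both choices, along with the names `gini_impurity` and `select_dimensions`, are assumptions made for illustration; the compressed sensing and fingerprint extraction steps are not shown.

```python
import numpy as np


def gini_impurity(column):
    """Gini impurity 1 - sum(p_k^2) of the discrete values in one dimension."""
    _, counts = np.unique(column, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))


def select_dimensions(fingerprints, keep):
    """Score every fingerprint dimension and keep the `keep` highest-scoring ones.

    fingerprints: (n_items, n_dims) array of discrete values (e.g. bits).
    Returns the kept column indices in ascending order, and all scores.
    """
    fingerprints = np.asarray(fingerprints)
    scores = np.array(
        [gini_impurity(fingerprints[:, j]) for j in range(fingerprints.shape[1])]
    )
    # Assumption: a higher score marks a more discriminative dimension.
    order = np.argsort(-scores, kind="stable")
    kept = np.sort(order[:keep])
    return kept, scores


# Toy library of four 4-bit fingerprints: columns 0 and 3 never change.
library = np.array([
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
])
kept, scores = select_dimensions(library, keep=2)
reduced_library = library[:, kept]  # feature library with fewer dimensions
```

A query fingerprint would be reduced with the same `kept` indices before matching, so that the training and retrieval stages compare vectors of the same dimensions.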
Keywords/Search Tags: audio retrieval, fusion features, discrete Gini coefficient, audio fingerprint