Research And Implementation Of Abnormal Audio Monitoring System For Adaptive Scenes

Posted on: 2020-10-14 | Degree: Master | Type: Thesis
Country: China | Candidate: Z M Wu | Full Text: PDF
GTID: 2428330620958472 | Subject: Electronic and communication engineering
Abstract/Summary:
As an important channel through which people perceive the external environment, hearing can supplement vision when sight is obstructed or illumination is poor. In video surveillance, audio monitoring can likewise be an important complement to video. Existing audio monitoring methods require a large amount of labeled training data for each new scene in which they are deployed. How to build training data automatically, so that a system adapts to different scenes and saves labeling cost, is therefore a valuable research direction. This thesis focuses on an audio monitoring system that adapts to different scenes. The system automatically records and builds background training data for the current scene in order to distinguish background sounds from abnormal sounds at the monitored site. Two main problems must be solved: first, how to automatically determine the number of clusters for audio data of unequal length and cluster it quickly; second, how to handle training data that is unevenly distributed, mixed between classes, and markedly varied within classes. The work of this thesis includes:

(1) A continuous audio segmentation method is proposed to improve the purity of audio segments. The method is based on the pitch frequency difference, which it uses to describe the difference between audio segments, and it improves segment purity for subsequent clustering and detection.

(2) A distance calculation method is proposed for audio segments of unequal length. N-order feature points are defined to describe the audio envelope. Based on fast alignment of these N-order feature points, an audio distance calculation method is proposed for computing the Dynamic Time Warping distance: the audio segments are aligned by their N-order feature points, and the Dynamic Time Warping distance is calculated segment by segment. Experiments show that this method effectively reduces the time complexity of the distance calculation. When computing the distance between two samples of about 10 seconds each, the fast-alignment Dynamic Time Warping method saves up to 2.5 seconds compared with the whole-segment method.

(3) An audio clustering method is improved to raise clustering purity. The concept of connected distance is proposed, and an audio clustering method based on peak density is improved. Adding the definition of connected distance to the clustering process raises clustering purity by about 18% on the two-dimensional sample set and by about 30% on the audio sample set.

(4) The Gaussian Mixture Model-Universal Background Model classification method is introduced. This method alleviates the inter-class mixing and unsatisfactory within-class purity caused by the uneven distribution of background data. A reasonable structure and parameters for the classification model are found through comparative analysis of a large number of experiments. In the two-class experiment, the precision and recall for abnormal events are 97% and 83% respectively.

(5) An abnormal audio monitoring system for adaptive scenes is designed and implemented. The system includes audio event detection, target sample selection, clustering of background training samples, model training for abnormal events and background events, and detection of abnormal audio events. Offline verification experiments in two scenes and an online real-world experiment in one scene are carried out. In the offline verification experiments, the overall accuracy and recall for abnormal events are 65% and 83% respectively in the laboratory scene, where the background energy is high and the events are complex, and 65% and 83% respectively in the family scene, where the background energy is low. In the online monitoring experiment in the laboratory scene, the precision and recall reach 56% and 79% respectively.
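
The segment-by-segment Dynamic Time Warping idea in item (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not define N-order feature points, so the `anchors` argument below is a hypothetical stand-in for already-matched alignment points, and the names `dtw_distance` and `segmented_dtw` are assumptions made for illustration. The sketch only shows why splitting at matched points shrinks the cost table that has to be filled.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between samples
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def segmented_dtw(a, b, anchors):
    """Sum of DTW distances over segments cut at matched index pairs.

    `anchors` is a hypothetical list of (i, j) pairs, strictly increasing in
    both coordinates, meaning a[i] is assumed to align with b[j]. How such
    pairs are found (the N-order feature points) is not covered here.
    """
    total = 0.0
    prev_i, prev_j = 0, 0
    for i, j in list(anchors) + [(len(a), len(b))]:
        total += dtw_distance(a[prev_i:i], b[prev_j:j])
        prev_i, prev_j = i, j
    return total


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [1.0, 2.0, 2.0, 3.0, 4.0]
    print(dtw_distance(x, y))
    print(segmented_dtw(x, y, [(2, 3)]))
```

With k anchors that cut both sequences into roughly equal pieces, the number of table cells filled drops from about n·m to about n·m/(k+1). The price is that the warping path is forced through the anchors, so the segmented result can never be smaller than the whole-sequence distance.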
Keywords/Search Tags:Audio Monitoring System, Fast Alignment, Connected Distance, Clustering Method of Background Training Samples, Scenes Adaptation