An Indoor Intelligent Monitoring System With Fusion Of Audio And Video

Posted on: 2017-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Gui
Full Text: PDF
GTID: 2308330485464019
Subject: Computer application technology
Abstract/Summary:
In recent years, rapid economic development has been accompanied by substantial progress in computer technology and signal processing, and indoor intelligent monitoring is attracting growing attention. An intelligent monitoring system overcomes the single-function limitation of conventional video monitoring and can detect and track targets in the monitored scene in real time. It also greatly reduces labor, material, and financial costs, and has therefore been widely applied in industry, transportation, banking, and security. However, the monitoring capacity and coverage of a single camera are limited, and adding cameras increases the investment cost. This thesis therefore proposes an indoor intelligent monitoring system that fuses audio and video. Building on a review of the relevant domestic and international literature, it monitors and detects abnormal events from both the audio and the video side. Specifically, the following research and development work is carried out:

(1) Based on an analysis of the characteristics of abnormal sounds and of indoor noise models, the thesis proposes a preprocessing method for the signals acquired by the microphone array, comprising sound endpoint detection (SED) and background noise removal. The experiments show that SED alone does not achieve satisfactory detection at low signal-to-noise ratio (SNR), but that its accuracy improves greatly when endpoint detection is performed on the denoised signals. Subsequent experiments show that good preprocessing not only reduces the amount of computation but also improves localization precision.

(2) Based on the signal generation model of the microphone array, the thesis summarizes common time delay estimation techniques and studies in depth several methods with good real-time performance. In the experiments, εRMSE (root mean square error, RMSE) and ηAR (abnormal rate, AR) are used to describe the dispersion of the estimates and the proportion of abnormal estimates relative to the true value. The results indicate that, under different noise and reverberation conditions, the generalized cross-correlation (GCC) time delay estimation algorithm offers high accuracy and good real-time performance. When the SNR is 5 dB and the reverberation time is 100 ms, the time delay method based on HAPP (Human Auditory Perception Properties), applied after SED, performs best, with εRMSE and ηAR of 0.5054 and 0.0385 respectively.

(3) Based on the geometric relation between the sound source and the microphone array, the thesis introduces the principles and derivations of several common near-field source localization techniques. Building on the time delay estimates, localization experiments are conducted on a large number of sound sources at different positions and distances. The results show that the εRMSE and ηAR for angle and distance localization are below 0.1 and 0.3 respectively, so the estimation errors are small. In general, the algorithm therefore satisfies the basic requirements of localization in an indoor environment.

(4) The thesis proposes an anomaly detection method that fuses audio and video. To a certain extent, the method overcomes the blind zones of video-only monitoring by combining audio signals and video images to reach a comprehensive judgment on indoor security. For audio detection, it describes the basic principles, parameter estimation, and identification method of the Gaussian Mixture Model (GMM) and studies how the mixture order and the choice of feature parameters affect identification rate and time complexity. Extensive experiments indicate that, with a mixture order of 32, abnormal sound detection based on MFCC_E and GMM reaches an average identification rate above 85% with lower time complexity. For video detection, it explains the principle of moving object detection based on the single-Gaussian background model, and the results confirm the effectiveness of this algorithm in an indoor environment.

(5) Combining the algorithms studied in the previous chapters, the thesis presents an indoor intelligent monitoring system that fuses audio and video, developed on the PC platform using Visual C++ 6.0. First, the system preprocesses the signals acquired by the microphone array. Then, after locating the sound source, it steers the dome camera toward it in real time. Finally, it applies the anomaly detection technique to the monitored scene to decide whether to trigger the alarm. The software is tested in a real indoor environment, and the results indicate that the system achieves the desired performance in locating and detecting anomalies.
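
As an illustration of the time delay estimation step described in item (2), the following is a minimal sketch of generalized cross-correlation for a two-microphone pair. It is not the thesis implementation: the PHAT (phase transform) weighting is one common GCC variant chosen here as an assumption, and the names `dft`, `gcc_phat_delay`, `mic_a`, `mic_b`, and `max_lag` are hypothetical. A naive DFT is used only to keep the example free of external dependencies.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (dependency-free, for clarity only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples by which sig_b trails sig_a (negative if it leads)."""
    # Zero-pad so the circular correlation computed via the DFT does not wrap around.
    n = len(sig_a) + len(sig_b)
    a = list(sig_a) + [0.0] * (n - len(sig_a))
    b = list(sig_b) + [0.0] * (n - len(sig_b))
    spec_a, spec_b = dft(a), dft(b)

    # Cross-power spectrum with PHAT weighting: keep only the phase of each bin.
    weighted = []
    for fa, fb in zip(spec_a, spec_b):
        cross = fb * fa.conjugate()
        mag = abs(cross)
        weighted.append(cross / mag if mag > 1e-12 else 0j)

    # Back to the lag domain; the peak position is the delay estimate.
    corr = [v.real for v in dft(weighted, inverse=True)]
    # Negative lags live at the end of the circular buffer, hence the modulo.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: corr[lag % n])


# Synthetic example: the second channel is the first one delayed by 5 samples.
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(59)]
true_delay = 5
mic_a = source + [0.0] * true_delay
mic_b = [0.0] * true_delay + source
print(gcc_phat_delay(mic_a, mic_b, max_lag=16))  # prints 5
```

Dividing the returned lag by the sampling rate gives the delay in seconds, which is the quantity a localization stage such as the one in item (3) would consume. In practice the naive transform would be replaced by an FFT, and the noise and reverberation conditions studied in the thesis would make the correlation peak less sharp than in this clean synthetic case.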
Keywords/Search Tags: Indoor Intelligent Monitoring, Microphone Array, Time Delay Estimation, Sound Localization, Anomaly Detection