Design And Implementation Of Audio Data Collection System

Posted on: 2019-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Guo
Full Text: PDF
GTID: 2348330545455734
Subject: Electronics and Communications Engineering
Abstract/Summary:
The audio data collection system acquires audio data through web crawling and real-time recording, then automatically labels the collected audio as speech, environment sound, music, or other audio categories using signal processing and pattern recognition techniques. The system consists of three subsystems: a network audio data crawler subsystem, a real-time audio recording subsystem, and an audio data classification subsystem. The main work of this thesis is as follows.

(1) Design and implement the network audio data crawler subsystem for specific media data. The crawling targets are the audio broadcast programs of the Beijing Broadcasting Network and AudioSet, the large-scale audio data set drawn from YouTube. AudioSet is mainly used in the training stage of the audio data classification subsystem. The trained classification model then classifies the broadcast programs captured from the Beijing Broadcasting Network into three audio types: speech, environment sound, and music. Finally, the tagged data is used by the speaker recognition system and other semantic-level recognition systems.

(2) Design and implement the audio online recording subsystem, including detection of basic audio events. The subsystem uses multiple processes to record audio on top of the Advanced Linux Sound Architecture (ALSA). While recording, it detects basic audio events in real time from audio features such as short-term energy, spectral flatness, and peak-valley difference. In other words, basic audio events are tagged in real time as the audio is recorded.

(3) Implement and optimize the audio data classification subsystem. The system first carries out performance testing and analysis of the non-speech classification section of the content system, based on the characteristics of long-time audio segments. It then relies on the non-speech labels provided by the DNN-based speech/non-speech detection system to extract new features, and uses a new classification model to separate environment sound from music. This optimizes classification performance across the three audio categories of speech, environment sound, and music. The optimization scheme mainly includes expanding the training data set, adding MFCC and VQT features, and using classification models such as decision trees and random forests to improve the discrimination rules for environment sound/music classification, ultimately improving the performance and recognition accuracy of the classification model.
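
As a rough illustration of the frame-level features named in (2), the sketch below computes short-term energy and spectral flatness in plain Python and applies a simple threshold rule to one frame. The function names, the threshold values, and the three labels are assumptions made for illustration; they are not taken from the thesis. Peak-valley difference is left out because the abstract does not define it.

```python
import cmath
import math


def short_term_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum for bins 0..n/2 (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum, in (0, 1]."""
    power = [p + eps for p in power_spectrum(frame)]  # eps avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def label_frame(frame, energy_threshold=1e-4, flatness_threshold=0.3):
    """Hypothetical rule: low energy is silence, low flatness is tonal."""
    if short_term_energy(frame) < energy_threshold:
        return "silence"
    if spectral_flatness(frame) < flatness_threshold:
        return "tonal"
    return "noise-like"


if __name__ == "__main__":
    # A 64-sample frame holding exactly four cycles of a sine wave.
    tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
    print(label_frame(tone))
```

Flatness near 0 means the energy is concentrated in a few frequency bins, as in a pure tone, while flatness near 1 means the spectrum is close to uniform, as in white noise or a single impulse. A real-time detector would apply such a rule to successive frames from the recording buffer, with thresholds tuned on labelled data.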
Keywords/Search Tags: audio data classification, random forest, ALSA, web crawler