Automatic Pornographic Video Recognition Based On Audio

Posted on: 2012-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Ji
Full Text: PDF
GTID: 2178330335960052
Subject: Signal and Information Processing
Abstract/Summary:
In recent years, digital video has come into wide use in many important areas. Its widespread dissemination makes people's lives easier, but it has also created problems: violent, pornographic, and other indecent video content spreads through the same channels, which is harmful to society. For this reason, effective automatic analysis and detection of video content is meaningful work. However, processing video data directly requires much larger storage space and higher computing speed because of the volume of data involved. At the same time, it is still difficult to automatically extract high-level semantic structure from a general video stream. Video content analysis therefore has to draw on other sources of information as well. Audio, which is also a time-dependent medium, is an important component of a video file. It can supplement the visual information and offers a useful route to video content analysis and recognition. In this thesis, we make use of the associated audio information to help recognize video content.

Generally speaking, the audio in pornographic video does not differ from other types of audio in its physical properties, so traditional audio processing methods can be applied. Based on this assumption, GMM and HMM models are adopted to carry out the audio recognition process.

The proposed solution consists of the following main stages. First, the audio is separated from the video stream and converted to WAVE format (16-bit, 22 kHz, mono). Second, the continuous digital audio signal is split into fixed-length audio frames using a Hamming window. Next, a 36-dimensional feature vector is computed for each frame, comprising 26 MFCC coefficients, 1 zero-crossing rate, 1 short-time energy, 4 sub-band energies, and 4 sub-band energy ratios.
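The framing and per-frame feature steps can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation (which was written in VC6.0). The function names, the hop size, the choice of four equal-width sub-bands, and the naive DFT are all assumptions made for the example, and the 26 MFCC coefficients are omitted, so the sketch yields only the remaining 10 of the 36 dimensions.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (standard 0.54 - 0.46*cos form)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def split_frames(signal, frame_len, hop):
    """Cut the signal into fixed-length frames; a trailing partial frame is dropped."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def subband_energies(frame, n_bands=4):
    """Energy in n_bands equal-width bands of the one-sided spectrum.

    Assumption: the source does not give the band edges, so equal-width
    bands are used here. A naive O(n^2) DFT keeps the sketch dependency-free.
    """
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power.append(abs(coeff) ** 2)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]


def frame_features(frame):
    """Return [zcr, energy, 4 sub-band energies, 4 sub-band energy ratios]."""
    window = hamming(len(frame))
    windowed = [x * w for x, w in zip(frame, window)]
    bands = subband_energies(windowed)
    total = sum(bands) or 1.0  # avoid division by zero on an all-zero frame
    ratios = [e / total for e in bands]
    return [zero_crossing_rate(frame), short_time_energy(windowed)] + bands + ratios
```

In practice an FFT routine and an MFCC implementation from a signal-processing library would replace the naive DFT, and the frame length and hop would be chosen to suit the 22 kHz sampling rate.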
Each frame is first classified as silence or non-silence according to its short-time energy, and only non-silence frames are processed further. Groups of ten successive non-silence frames are then classified by a pre-trained GMM as music, speech, music & speech (a mixture of music and speech), or environmental sound. Finally, music and music & speech frames are fed into classifiers built from pre-trained HMMs, which give the final result indicating whether the current frame is pornographic.

The algorithm described above has been implemented in VC6.0, and the test results show that the system can help with pornographic video recognition.
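The three-stage decision cascade described above might be organized along these lines. This is a hedged sketch only: `audio_type_of` and `is_pornographic` are hypothetical callables standing in for the pre-trained GMM and HMM classifiers, the energy threshold is a free parameter, and grouping the non-silence frames into non-overlapping blocks of ten after silence removal is an assumption the abstract does not confirm.

```python
def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def drop_silence(frames, energy_threshold):
    """Stage 1: keep only frames whose short-time energy exceeds the threshold."""
    return [f for f in frames if short_time_energy(f) > energy_threshold]


def group_segments(frames, size=10):
    """Group successive frames into blocks of `size`; a trailing partial block is dropped."""
    return [frames[i:i + size] for i in range(0, len(frames) - size + 1, size)]


# Audio types that the abstract says are passed on to the HMM stage.
FORWARDED = {"music", "music+speech"}


def classify_segments(frames, energy_threshold, audio_type_of, is_pornographic):
    """Run the cascade and return one (audio_type, flagged) pair per segment.

    audio_type_of(segment)   -> "music", "speech", "music+speech" or "environment"
                                (hypothetical stand-in for the GMM stage)
    is_pornographic(segment) -> bool (hypothetical stand-in for the HMM stage)
    """
    results = []
    for segment in group_segments(drop_silence(frames, energy_threshold)):
        label = audio_type_of(segment)
        # Stage 3 runs only for the forwarded audio types.
        flagged = is_pornographic(segment) if label in FORWARDED else False
        results.append((label, flagged))
    return results
```

Passing the two classifiers in as callables keeps the control flow separate from the models, so either stage can be swapped or tested on its own.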
Keywords/Search Tags: Pornographic Video, Audio, GMM, HMM