
Research Of Violence Audio Fragment Detection Based On Tensor Model

Posted on: 2016-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: J X Liang
Full Text: PDF
GTID: 2308330479490036
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, it has become increasingly convenient for people to share multimedia online. Inevitably, some violent multimedia content spreads across the network. For minors and other vulnerable groups, such content can have a serious negative impact on behavior. Relying solely on manual auditing is very inefficient, so a method is needed to detect violent content and prevent its spread.

The auditory channel is an important way for people to receive multimedia information. In previous research, it has often served only as a supplement to the visual channel in audio-visual systems, and the limitations of violent audio recognition in turn degrade the overall detection performance on video. This thesis therefore focuses on detecting violent content in the auditory channel.

The thesis studies violent audio detection based on a tensor model. First, a feature set that is effective for detecting violent content is extracted and used to construct a feature tensor for each class. Then, the tensor of each class is decomposed to construct a projection subspace. Projecting the original features into this subspace yields a low-dimensional feature vector, which greatly reduces the feature dimensionality while retaining the inherent structural information. Finally, a violent audio classification method using the projected features is proposed: Gaussian models are built for the audio features of the different classes, and the final prediction is made by a minimum Bayes risk decision.

The experiments use the audio data set provided by the MediaEval 2013 workshop. The experimental results show that, compared with the traditional feature set, the proposed method achieves higher recall and F1 but relatively lower precision.
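The pipeline described above (per-class feature tensor, decomposition, projection, Gaussian models, minimum Bayes risk decision) can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not confirm: the decomposition is a truncated HOSVD-style SVD of the mode unfoldings, each audio segment is a frames-by-features matrix, a segment's projections into all class subspaces are concatenated, each class is modeled by a single full-covariance Gaussian, and the loss matrix is invented. The data are synthetic, not MediaEval 2013, and all function names are hypothetical.

```python
import numpy as np

def mode_basis(tensor, mode, rank):
    """Leading left singular vectors of the mode-`mode` unfolding of a tensor."""
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank]

def class_subspace(samples, ranks):
    """samples: (n, frames, feats) tensor of one class -> (U_frames, U_feats)."""
    return mode_basis(samples, 1, ranks[0]), mode_basis(samples, 2, ranks[1])

def project(x, subspaces):
    """Project one (frames, feats) matrix into every class subspace, then concatenate."""
    return np.concatenate([(uf.T @ x @ ud).ravel() for uf, ud in subspaces])

def fit_gaussian(vectors, reg=1e-3):
    """Mean and regularized covariance of the projected vectors of one class."""
    mean = vectors.mean(axis=0)
    cov = np.cov(vectors, rowvar=False) + reg * np.eye(vectors.shape[1])
    return mean, cov

def log_gaussian(v, mean, cov):
    """Log density of a multivariate Gaussian at v."""
    diff = v - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (maha + logdet + len(v) * np.log(2 * np.pi))

def min_bayes_risk(v, models, priors, loss):
    """loss[i, j] is the cost of deciding class i when the true class is j."""
    log_joint = np.array([log_gaussian(v, m, c) for m, c in models]) + np.log(priors)
    post = np.exp(log_joint - log_joint.max())
    post /= post.sum()
    return int(np.argmin(loss @ post))  # decision with the smallest expected loss

# Synthetic stand-in data: 40 segments per class, each 6 frames x 5 features.
rng = np.random.default_rng(0)
train = [rng.normal(0.0, 1.0, (40, 6, 5)),   # class 0: "non-violent"
         rng.normal(3.0, 1.0, (40, 6, 5))]   # class 1: "violent"
subspaces = [class_subspace(t, ranks=(2, 2)) for t in train]
models = [fit_gaussian(np.array([project(x, subspaces) for x in t])) for t in train]

priors = np.array([0.5, 0.5])
loss = np.array([[0.0, 2.0],    # deciding non-violent: a missed violent segment costs 2
                 [1.0, 0.0]])   # deciding violent: a false alarm costs 1

test_x = np.concatenate([rng.normal(0.0, 1.0, (20, 6, 5)),
                         rng.normal(3.0, 1.0, (20, 6, 5))])
test_y = np.array([0] * 20 + [1] * 20)
pred = np.array([min_bayes_risk(project(x, subspaces), models, priors, loss)
                 for x in test_x])
accuracy = float((pred == test_y).mean())
```

Here each 6 x 5 segment (30 values) is reduced to an 8-dimensional vector, since each of the two class subspaces contributes a 2 x 2 projection. The asymmetric loss matrix shows how a minimum Bayes risk rule can be tuned to favor recall over precision.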
To improve the recognition rate and remedy the low precision, a method that fuses long-term and short-time features is proposed. Experimental results show that, compared with the traditional method, recall, precision, and F1 all improve.
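As a rough illustration of the fusion idea and of the reported metrics, the sketch below combines per-segment scores from a short-time classifier and a long-term classifier by a weighted sum, then computes precision, recall, and F1. The abstract does not state how the fusion is performed, so the weighted late fusion, the weight, the threshold, and all scores here are assumptions made up for the example; they are not results from the thesis.

```python
def fuse_scores(short_scores, long_scores, weight=0.5):
    """Weighted late fusion of two per-segment violence scores in [0, 1]."""
    return [weight * s + (1.0 - weight) * l
            for s, l in zip(short_scores, long_scores)]

def precision_recall_f1(labels, predictions):
    """Precision, recall, and F1 for the positive (violent) class, label 1."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Invented toy scores for six segments (1 = violent, 0 = non-violent).
labels = [1, 1, 1, 0, 0, 0]
short_scores = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]  # hypothetical short-time classifier
long_scores = [0.8, 0.3, 0.7, 0.2, 0.4, 0.1]   # hypothetical long-term classifier

fused = fuse_scores(short_scores, long_scores, weight=0.5)
predictions = [1 if score >= 0.5 else 0 for score in fused]
precision, recall, f1 = precision_recall_f1(labels, predictions)
```

On these toy scores the fused decision flags two of the three violent segments and raises no false alarm. In practice the fusion weight and the decision threshold would be tuned on held-out data.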
Keywords/Search Tags: violent audio recognition, tensor model, subspace, classifier fusion