| With the continued growth of the web, online video has become commonplace in every household. Many harmful videos circulate on the Internet, such as horror, violent, and pornographic videos. These are referred to as specified sensitive videos, and they are harmful to the physical and mental health of viewers, especially teenagers. Many researchers have therefore paid considerable attention to the recognition of sensitive videos. However, existing work has shortcomings that lead to low recognition rates, for two reasons. First, some researchers ignore the role of sensitive information in key frame extraction, overlooking important features such as emotion, motion, and private body parts. Second, existing recognition methods for specified sensitive videos ignore either the multiple contextual information among shots or the dependency relationships among multiple features. Because recognizing sensitive videos is important, this thesis focuses on research in this area. The thesis covers two aspects: structure analysis and feature representation, and context construction and the dependent model. The author proposes effective algorithms to address the problems above. Experiments on a specified sensitive video dataset demonstrate that the proposed method outperforms existing algorithms. In the analysis of video structure, the thesis adopts a shot boundary detection algorithm based on information theory and combines sensitive and general key frames, which improves the accuracy of feature selection. In the representation of video features, the thesis describes horror videos by color and emotion features, violent videos by motion and violent-element features, and pornographic videos by skin color and private body part features.
These three groups of features capture the inherent properties of sensitive videos and improve sensitive semantic analysis. In the extraction of contextual information, multiple contextual structure graphs are constructed for the shots in a video clip. The thesis proposes a classification algorithm based on this multiple contextual structure. The algorithm models the inner structure of a video to aid recognition, which improves classification performance. In the fusion of the dependent model, the dependency relationships among diverse feature spaces are quantified. The thesis proposes a fusion algorithm based on a linear dependent model. The algorithm compares prior and posterior probabilities and solves the linear dependent model to obtain feature weights. These weights are used to fuse the classification results, which further improves classification performance. |
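
The weighted fusion step described above can be illustrated with a small sketch. The abstract does not give the formulation of the linear dependent model, so the code below substitutes ordinary least squares as a stand-in: it fits one weight per feature-specific classifier so that the weighted sum of their posterior scores approximates the ground-truth labels, then uses those weights to fuse new scores. The function names (`solve_linear_system`, `fit_fusion_weights`, `fuse`), the clipping of negative weights, and the normalization to a sum of one are all assumptions made for illustration, not details taken from the thesis.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: score columns are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fusion_weights(scores, labels):
    """Least-squares fusion weights (assumed stand-in for the thesis's model).

    scores: list of rows, one posterior score per feature-specific classifier
    labels: list of 0/1 ground-truth labels, one per row
    """
    n_feat = len(scores[0])
    # Normal equations: (S^T S) w = S^T y
    gram = [[sum(row[i] * row[j] for row in scores) for j in range(n_feat)]
            for i in range(n_feat)]
    rhs = [sum(row[i] * y for row, y in zip(scores, labels))
           for i in range(n_feat)]
    raw = solve_linear_system(gram, rhs)
    # Assumption: keep weights non-negative and normalize them to sum to one.
    clipped = [max(w, 0.0) for w in raw]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / n_feat] * n_feat
    return [w / total for w in clipped]


def fuse(score_row, weights):
    """Weighted sum of per-classifier scores for a single video."""
    return sum(w * s for w, s in zip(weights, score_row))
```

In this sketch, `fit_fusion_weights` would be run on held-out classifier scores, and `fuse` would then be applied to each new video before thresholding the fused score. A classifier whose scores track the labels closely receives most of the weight, which is the general behavior the abstract attributes to the learned feature weights.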