Internet Text Video Filtering Technology Research And Applications

Posted on: 2011-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: L X Xu
Full Text: PDF
GTID: 2208330332977293
Subject: Software engineering
Abstract/Summary:
Due to the lack of supervision mechanisms, more and more harmful information appears on the internet. To build a clean internet environment for young users, automatic removal of harmful information is very important. Harmful text is increasingly distributed inside images in order to evade current text filtering systems, which makes "text image" filtering a new challenge in information filtering. Our research focuses on filtering harmful information carried by text images on the internet, which involves two aspects: character recognition in text images and text filtering. To improve the precision of text image filtering, we propose several methods for the key technologies in both aspects. Text region locating and text extraction methods are proposed to improve the precision of character recognition in images with complex backgrounds. Text topic identification and semantic orientation analysis are presented for text filtering. Concretely, this thesis makes the following contributions:

(1) A connected-component-based approach is provided for text locating. The algorithm takes advantage of the geometric shape features of individual characters and the collective features of characters within text regions, and integrates these features into a classification process. A cascade of threshold classifiers is combined with a support vector machine to recognize characters. Experimental results demonstrate that the proposed algorithm achieves high precision in text locating.

(2) For text extraction from images with complex backgrounds, a color segmentation algorithm based on the HSL color space is proposed to reduce the influence of varying character colors and complex backgrounds. The algorithm categorizes text regions into three color types, then segments each type using a different HSL component, so that the strength of each HSL component is used effectively. Experimental results demonstrate the effectiveness of the proposed algorithm.

(3) In text filtering, the text topic is used to construct users' profiles, so as to decide whether an input text should be filtered. A method based on a concept knowledge tree is presented for text topic identification. The method uses the semantic relations between concepts to identify the key concepts of the input text, and constructs a compound concept to express the text topic. Experimental results show that the presented topic identification method performs encouragingly, and it is applied to text filtering.

(4) A method based on the context of topic words is proposed for semantic orientation analysis, in order to distinguish the negative and positive attitudes expressed in texts on the same topic. In this method, we assume that the semantic orientation of a text is related to its topic, and that it can be expressed by the relation between a topic word and its context. The proposed method removes the effect of topic variation. Experiments show that the influence of text topic variation is effectively suppressed by our algorithm.
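To make the idea behind contribution (2) more concrete, the following is a minimal sketch of choosing one HSL component per text region and splitting the region on that component. The abstract does not say what the three color types are or how they are detected, so the typing rule here (share of saturated pixels), the constant `SAT_THRESHOLD`, the cut-offs 0.1 and 0.9, and the function names are all assumptions made for illustration, not the thesis's algorithm.

```python
import colorsys

# Assumed cut-off between "gray" and "colored" pixels (not from the thesis).
SAT_THRESHOLD = 0.2


def to_hls(pixels):
    """Convert (r, g, b) tuples in 0..255 to (h, l, s) tuples in 0..1."""
    return [colorsys.rgb_to_hls(r / 255, g / 255, b / 255) for r, g, b in pixels]


def choose_component(hls_pixels):
    """Hypothetical typing rule: pick H, S, or L from the share of saturated pixels."""
    saturated = sum(1 for _, _, s in hls_pixels if s >= SAT_THRESHOLD)
    share = saturated / len(hls_pixels)
    if share < 0.1:
        return "L"  # almost no color: only lightness separates text from background
    if share > 0.9:
        return "H"  # everything is colored: separate by hue
    return "S"      # colored and gray pixels mixed: separate by saturation


def segment(pixels):
    """Return the chosen component and a two-class mask over the region's pixels."""
    hls = to_hls(pixels)
    component = choose_component(hls)
    index = {"H": 0, "L": 1, "S": 2}[component]
    values = [p[index] for p in hls]
    # Midpoint split; this ignores the circular nature of hue, which a real
    # implementation would need to handle.
    threshold = (min(values) + max(values)) / 2
    return component, [v > threshold for v in values]
```

For example, `segment([(0, 0, 0), (255, 255, 255)])` selects the lightness component, while a region mixing pure red and mid-gray pixels is split on saturation. The mask only separates two pixel groups; deciding which group is the text would need further cues, such as the geometric features mentioned in contribution (1).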
Keywords/Search Tags: Information filtering, Text region detection, Text extraction, Topic identification, Semantic orientation analysis