Research On Techniques For Free Text Classification

Posted on: 2015-12-17
Degree: Master
Type: Thesis
Country: China
Candidate: S L Nian
GTID: 2308330461458654
Subject: Computer technology
Abstract/Summary:
With the rapid development of information technology, the volume of text data is growing explosively. To let users find the information they actually need quickly and easily, text data must be classified. Existing text classification methods focus on structured text with a limited range of semantics and little variation in form. However, both in the adoption of information technology by traditional enterprises and in the broad participation of users interacting on the Internet, a large amount of "free text" is produced. Free text has two notable features. The first is free content: the text covers very rich semantics. The second is free writing style: texts that express the same semantics can differ greatly in wording and narrative style. Because of these two features, directly applying existing text classification methods does not yield good classification results. This thesis studies free text classification and makes the following contributions.

First, because free content makes category-labeled samples scarce, a free text classification method is proposed that exploits a large number of unlabeled samples when labeled samples are scarce. The method is based on semi-supervised active learning. On the one hand, active learning identifies the texts that are most helpful for learning and obtains their category labels. On the other hand, semi-supervised learning uses the large number of texts that lack category labels to further improve learning performance.

Second, because free writing style leads to diverse representations of the same semantics and makes text hard to classify, a method is proposed for the case where images appear around the text: the images are used to reduce the diversity of semantic representation in free text classification. The method first applies multi-view image-text learning based on a correlation measure to identify how relevant the text is to its surrounding images. It then integrates the learned image-text correlation into multimodal classification, so that image information is used effectively to reduce the diversity of semantic representation and thereby improve text classification.
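To make the first contribution more concrete, the following is a minimal sketch of one round of semi-supervised active learning. It is not the thesis's actual algorithm, which the abstract does not specify. The Naive Bayes classifier, the least-confidence query rule, the self-training step with a confidence threshold, and all names (`train`, `predict_proba`, `one_round`, `oracle`, `threshold`) are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def train(labeled):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in labeled:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return {label: posterior probability} using Laplace smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            score += math.log((word_counts[label][tok] + 1)
                              / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}


def one_round(labeled, unlabeled, oracle, n_query=1, threshold=0.9):
    """One round of active learning plus self-training (illustrative only).

    Active step: the n_query least-confident texts are sent to `oracle`,
    a hypothetical stand-in for a human annotator.
    Semi-supervised step: remaining texts predicted with confidence at or
    above `threshold` are added with their predicted (pseudo) labels.
    """
    model = train(labeled)
    scored = []
    for text in unlabeled:
        proba = predict_proba(model, text)
        label = max(proba, key=proba.get)
        scored.append((proba[label], text, label))
    scored.sort(key=lambda item: item[0])  # least confident first

    new_labeled = list(labeled)
    new_labeled += [(text, oracle(text)) for _, text, _ in scored[:n_query]]

    remaining = []
    for conf, text, label in scored[n_query:]:
        if conf >= threshold:
            new_labeled.append((text, label))
        else:
            remaining.append(text)
    return new_labeled, remaining
```

In practice `one_round` would be called repeatedly until the annotation budget is spent or no unlabeled texts remain. The threshold trades off how much unlabeled text is absorbed against the risk of reinforcing wrong pseudo-labels.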
Keywords/Search Tags: Text Mining, Free Text Classification, Active Semi-Supervised Learning, Multi-Modal Learning