
Research on Microblog Text Feature Extension Based on Concept Algebra

Posted on: 2016-02-27 | Degree: Master | Type: Thesis
Country: China | Candidate: S P Wu | Full Text: PDF
GTID: 2308330479495444 | Subject: Computer applications
Abstract/Summary:
Microblog posts are much shorter than traditional texts, so their text features are much sparser. This sparsity introduces many errors and uncertainties into microblog text processing. The purpose of feature extension is to increase the number of semantic features so that a microblog text carries more semantic information, providing a higher-quality corpus for microblog text processing. The key questions in feature extension are how to obtain semantic information that is both sufficient and accurate, and which source to obtain it from.
To address feature sparsity in microblog text processing, this thesis studies a method for extending microblog text features based on concept algebra, together with a microblog text classification method. In the feature extension study, Wikipedia serves as the semantic knowledge base, and the structural properties of concept algebra are used to organize and represent the microblog text after its features have been extended. Feature extension consists of two parts: microblog text preprocessing and semantic extension of text features. To deal with texts that are too short and terms that are imprecise, the thesis analyzes the structure of microblog text and the characteristics of Wikipedia, then lengthens the text using microblog forwarding information and corrects term meanings using Wikipedia redirect pages. For the semantic extension step, a concept correlation calculation method based on the Wikipedia category network is designed to select Wikipedia explanatory information as the extended features of the microblog text.
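The preprocessing and extension steps described above can be illustrated with a small sketch. The abstract does not give the thesis's actual correlation formula or data structures, so everything here is an assumption made for illustration: the function names, the toy redirect and category tables, the candidate lists, the threshold, and in particular the use of a plain Jaccard overlap of category sets as a stand-in for the category-network correlation measure.

```python
def normalize_terms(terms, redirects):
    """Replace each term by its redirect target, if the (hypothetical) table has one."""
    return [redirects.get(term, term) for term in terms]


def category_overlap(cats_a, cats_b):
    """Jaccard overlap of two category sets.

    This is a stand-in for illustration only, not the thesis's measure.
    """
    a, b = set(cats_a), set(cats_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extend_features(post_terms, forwarded_terms, redirects,
                    categories, candidates, threshold=0.3):
    """Lengthen a post with forwarded text, normalize terms, then add
    candidate concepts whose category overlap reaches the threshold."""
    terms = normalize_terms(list(post_terms) + list(forwarded_terms), redirects)
    terms = list(dict.fromkeys(terms))  # drop duplicates, keep order
    extended = list(terms)
    for term in terms:
        for cand in candidates.get(term, []):
            if cand in extended:
                continue
            score = category_overlap(categories.get(term, ()),
                                     categories.get(cand, ()))
            if score >= threshold:
                extended.append(cand)
    return extended


# Toy data, invented for this example.
redirects = {"NYC": "New York City"}
categories = {
    "New York City": {"Cities in the United States", "Populated coastal places"},
    "Manhattan": {"Cities in the United States", "Boroughs"},
    "Pizza": {"Foods"},
}
candidates = {"New York City": ["Manhattan", "Pizza"]}

result = extend_features(["NYC", "trip"], ["New York City"],
                         redirects, categories, candidates)
print(result)
```

In this toy run the redirect table maps "NYC" to "New York City", the forwarded text contributes a duplicate that is dropped, and only the candidate sharing a category with the source concept is kept as an extended feature.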
For microblog text classification, the thesis improves the concept similarity computation of concept algebra: it introduces the input and output relations of concepts into the similarity calculation and adds a text similarity density function, and on this basis classifies microblog texts.
Finally, the feature extension method and the text classification method are verified and analyzed experimentally. Compared with the correlation calculation method of D. Milne and I. H. Witten, the proposed correlation calculation achieves better accuracy, which supports the feasibility of the feature extension method. Classification with extended features gives better results than classification without extension, indicating that feature extension alleviates the feature sparsity problem of microblog text to a certain extent and that the text similarity calculation method is feasible.
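For readers unfamiliar with the baseline named above, the following sketch shows the link-based relatedness measure usually attributed to Milne and Witten, which adapts the Normalized Google Distance to the sets of Wikipedia articles that link to two concepts. The abstract does not say which variant the experiments used, so the function name, the zero-overlap convention, and the conversion from distance to relatedness (1 minus distance, clipped at 0) are assumptions.

```python
import math


def link_relatedness(links_a, links_b, total_articles):
    """Link-based relatedness between two Wikipedia concepts.

    links_a, links_b: ids of articles that link to each concept.
    total_articles: number of articles in the Wikipedia snapshot.
    Returns a value in [0, 1]; 0 when the link sets do not overlap.
    """
    a, b = set(links_a), set(links_b)
    common = a & b
    if not common:
        return 0.0
    larger = max(len(a), len(b))
    smaller = min(len(a), len(b))
    denom = math.log(total_articles) - math.log(smaller)
    if denom <= 0:
        # Degenerate case: a concept is linked from every article.
        return 0.0
    distance = (math.log(larger) - math.log(len(common))) / denom
    return max(0.0, 1.0 - distance)


# Toy link sets, invented for this example.
score = link_relatedness({1, 2, 3, 4}, {3, 4, 5, 6}, total_articles=1000)
print(round(score, 3))
```

Two concepts with identical incoming-link sets score 1.0, and the score falls as the shared links shrink relative to the larger set.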
Keywords/Search Tags: Feature Extension, Concept Algebra, Correlation Calculation, Text Classification, Similarity Calculation