Research Of Hot-Topic-Oriented Subjective And Objective Classification Method For Microblog Text

Posted on: 2014-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 2268330401462382
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Web 2.0, Internet access is no longer limited to a single computer: mobile phones, tablet PCs, and other mobile terminals have quietly entered everyday use. People not only obtain and share information through online communities and blogs, but can also share a message instantly on a microblog wherever they are. The substantial growth in microblog users has attracted a large number of scholars to study published microblog texts, and hot-topic-oriented subjective and objective classification of microblog text is one of the most important problems. So far, scholars have mainly studied microblog text without reference to a topic; the study of topic-related microblog text is still in its infancy.

Hot-topic-oriented microblog text is dispersed, meaning that a microblog text is usually not related to the current hot topic. This can lead to low precision in subjective and objective classification on a hot topic. For this reason, this paper proposes a hot-topic-oriented subjective and objective classification method for microblog text. The method considers both subjective and objective classification and topic-related classification, builds a separate classification model for each sub-problem, and then uses a logistic regression model to unify the results of the two parallel models in one model.

The major research contents and conclusions are as follows:

(1) Topic relevance algorithm based on Tongyici Cilin. The main task in the topic-related classification sub-problem is to judge whether a microblog text is related to the hot topic, and the key point is how to measure the degree of relevance between the two. In this paper, we calculate the distance between the current word and the hot topic words, based on the extended version of Tongyici Cilin, and use it as the degree of relevance in order to simplify the calculation of topic relevance.

(2) Opinion word set generation based on Chinese FrameNet. The main task in the subjective and objective classification sub-problem is to find opinion microblog texts, and building an effective opinion word set is one of the important steps. In this paper, we use the lexical units of the "Opinion" frame as a seed set and create an opinion word set based on the frame relations and lexical units in Chinese FrameNet.

(3) A logistic regression model is used to unify the result of the topic-related classification model and the result of the subjective and objective classification model in order to obtain the opinion text.

(4) This paper sets subjective and objective classification without topic-related judgment as the baseline and compares it with multi-class classification and step-by-step classification. We analyze the importance of the logistic regression model for the parallel integration of the two sub-problem models.
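
The fusion step described in the abstract, where a logistic regression model combines the outputs of the two parallel classifiers, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes that each sub-model emits one score in [0, 1] per microblog text (a topic relevance score and a subjectivity score), and the toy data, the function names `train_fusion` and `predict`, and the learning rate and epoch count are all hypothetical choices made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_fusion(samples, labels, lr=1.0, epochs=5000):
    """Fit a two-feature logistic regression by batch gradient descent.

    samples: list of (topic_score, subj_score) pairs (assumed inputs).
    labels:  1 if the text is an on-topic opinion text, else 0.
    """
    w = [0.0, 0.0]
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (topic, subj), y in zip(samples, labels):
            err = sigmoid(w[0] * topic + w[1] * subj + b) - y
            grad_w[0] += err * topic
            grad_w[1] += err * subj
            grad_b += err
        # Average the log-loss gradient over the batch and take one step.
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b

def predict(w, b, topic_score, subj_score):
    """Probability that a text is both on-topic and subjective."""
    return sigmoid(w[0] * topic_score + w[1] * subj_score + b)

# Hypothetical toy data: a text is labeled 1 only when both scores are high.
SAMPLES = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.95), (0.9, 0.1),
           (0.1, 0.9), (0.2, 0.2), (0.1, 0.1), (0.95, 0.2)]
LABELS = [1, 1, 1, 0, 0, 0, 0, 0]

w, b = train_fusion(SAMPLES, LABELS)
```

Calling `predict(w, b, topic_score, subj_score)` then gives a single probability per text. In this sketch a text that scores high on only one of the two sub-problems is pushed toward the negative class, which is the behavior the parallel integration is meant to provide.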
Keywords/Search Tags: Hot Topic, Subjective and Objective Classification, Topic-Related Classification, Logistic Regression Model