Research On Model Of Hot Topic Opinion Mining In Virtual Communities

Posted on: 2010-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: L Mai
Full Text: PDF
GTID: 2178360302959938
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of the Internet and the popularity of Web 2.0 applications, users have become increasingly important to the Internet. User Generated Content (UGC) is the most active, closely watched, and valued resource on the Web. UGC comes from the real world and reflects what users really think. Because virtual communities contain the largest amount of UGC, it is meaningful to study virtual communities and to mine their UGC. Our work and innovations are as follows:

The dissertation studies the features, structure, and content organization of virtual communities. It also differentiates the definitions of Subject and Topic, and studies the components, structure, and features of topics. It proposes constructing a tree structure for each topic from its reply relations.

The dissertation studies the causes and features of the "Topic Drift" phenomenon and proposes the concept of Topic Relevancy to detect it. Because UGC text lacks standardization, traditional algorithms based on text similarity perform poorly. The dissertation proposes a novel method that computes topic relevancy from the structure information of topics, and the method achieves good results in practice.

The dissertation studies the multiple features of Web documents and evaluates the importance of the different features. It proposes a novel text classification method, based on Naïve Bayes classification, that makes full use of these features. The method is applied to blog post classification and achieves good results in practice.

Based on the work above, the dissertation proposes a topic extraction method, a hotspot evaluation method, and an opinion mining method for virtual communities. Together, the three methods compose the hot topic and opinion mining model for virtual communities. The topic extraction method combines classification and clustering algorithms. The hotspot evaluation method evaluates the hot degree of topics by attention rate, relevancy, and timeliness. The topic opinion mining method obtains the overall opinion on a topic by analyzing the subjectivity, opinion polarity, and opinion object of each post in the topic. In practice, the topic extraction method achieves high precision, the results of hotspot evaluation agree with the actual situation, and the results of opinion mining reflect the overall opinion of users. The proposed hot topic and opinion mining model for virtual communities is therefore effective and meaningful.
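
The abstract mentions building a tree structure for a topic from its reply relations but gives no implementation details. The following is only a minimal sketch of that idea, not the dissertation's method. It assumes each post is available as a (post_id, parent_id) pair, with parent_id set to None for the opening post. The function names build_reply_tree and post_depths, and the sample data, are hypothetical.

```python
from collections import defaultdict


def build_reply_tree(posts):
    """Group posts by the post they reply to.

    Assumption: `posts` is an iterable of (post_id, parent_id) pairs and
    parent_id is None for the opening post of the topic.
    Returns (root_id, children), where `children` maps a post id to the
    list of post ids that reply to it directly.
    """
    children = defaultdict(list)
    root_id = None
    for post_id, parent_id in posts:
        if parent_id is None:
            root_id = post_id
        else:
            children[parent_id].append(post_id)
    return root_id, dict(children)


def post_depths(root_id, children):
    """Return the depth of every post reachable from the root (root = 0)."""
    depths = {root_id: 0}
    stack = [root_id]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            depths[child] = depths[node] + 1
            stack.append(child)
    return depths


if __name__ == "__main__":
    # Hypothetical topic: post 1 opens it, 2 and 3 reply to 1, 4 replies to 2.
    sample_posts = [(1, None), (2, 1), (3, 1), (4, 2)]
    root, replies = build_reply_tree(sample_posts)
    print(root, replies)
    print(post_depths(root, replies))
```

Structural quantities such as reply depth are one kind of information such a tree exposes. How the dissertation actually derives topic relevancy from the structure is not stated in the abstract.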
Keywords/Search Tags: Virtual Community, Topic Extraction, Hotspot Evaluation, Topic Opinion Mining, Topic Relevancy Algorithm Based on Structure Information, Multi-Feature Fusion Classification