Hot Topic Mining And Opinion Analysis On BBS

Posted on: 2009-03-10; Degree: Master; Type: Thesis
Country: China; Candidate: X N Yao
GTID: 2178360248455069; Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, the BBS, also called a Web forum, has become an important platform where people freely express their opinions. To find the hot topics that concern people during each period in a timely manner, and to learn the opinions and attitudes expressed on those topics, governments and Web monitors need effective and intelligent technology for monitoring public sentiment on BBS. The main contents of this thesis are as follows.

(1) Automatic extraction of BBS information. To collect BBS web pages and extract information from them automatically, the HTML Parser package and regular expressions are used to parse the HTML documents downloaded from the BBS. The information in each post thread is then extracted and stored in XML documents.

(2) Feature selection and weight computation for BBS texts. Unlike traditional texts, BBS texts have their own language and structure. Each feature item is evaluated on four aspects: term frequency, term position, term length, and the number of posts that contain the term. In the traditional Vector Space Model, the weight of a feature item is computed with the TF-IDF formula; in this thesis, the TF part is replaced with evaluation functions.

(3) BBS hot topic mining. Topic detection is a key step in hot topic mining. The Single-Pass, K-Means, and K-Medoids clustering algorithms are adopted to detect topics, and these models are improved in practice. The hotness of each topic is then scored from topic information, including the number of threads, the number of valuable threads, and the replies and views per hour.

(4) Opinion analysis on post threads. In a post thread, feature items are used as the opinion targets. Based on a polarity lexicon and a dependency parser, the SBV polarity hand-on algorithm is adopted to analyze the opinion sentences in BBS. Moreover, the SBV algorithm is supplemented by handling sentences that have a verb-verb relation between the subject-verb relation and the verb-object relation. Finally, based on the results of the opinion sentence analysis, the opinion of the whole post thread is analyzed. The experimental results verify the efficiency of the opinion analysis algorithm.
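The weighting idea in item (2) can be illustrated with a minimal Python sketch. The abstract does not give the actual evaluation functions, so everything here is an assumption made for illustration: a thread is modeled as a list of tokenized posts with the opening post first, and the hypothetical `evaluation_score` simply multiplies four factors (frequency, a position bonus for terms in the opening post, a length factor, and post coverage) before the usual IDF term is applied.

```python
import math


def evaluation_score(term, thread, opening_bonus=2.0):
    """Hypothetical replacement for the TF part (not the thesis's formula).

    thread: list of posts, each a list of tokens; thread[0] is the opening post.
    """
    frequency = sum(post.count(term) for post in thread)
    if frequency == 0:
        return 0.0
    # Assumed position factor: terms in the opening post count more.
    position = opening_bonus if term in thread[0] else 1.0
    # Assumed length factor: longer terms tend to be more specific.
    length = math.log2(1 + len(term))
    # Fraction of posts in the thread that contain the term.
    coverage = sum(1 for post in thread if term in post) / len(thread)
    return frequency * position * length * (1.0 + coverage)


def idf(term, threads):
    """Standard inverse document frequency, with threads as documents."""
    df = sum(1 for thread in threads if any(term in post for post in thread))
    return math.log(len(threads) / df) if df else 0.0


def weight_vector(thread, threads):
    """Feature weights for one thread: evaluation score times IDF."""
    vocabulary = {token for post in thread for token in post}
    return {t: evaluation_score(t, thread) * idf(t, threads) for t in vocabulary}


threads = [
    [["price", "rise"], ["price", "high"]],
    [["football", "match"], ["match", "win"]],
    [["price", "football"]],
]
weights = weight_vector(threads[0], threads)
```

In this toy data, "rise" and "high" occur equally often and have the same length, so any difference between their weights comes only from the assumed opening-post bonus.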
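Of the clustering algorithms named in item (3), Single-Pass is the simplest to sketch. The version below is the textbook form, not the improved variant the thesis refers to, whose details the abstract does not give. It assumes threads are already represented as sparse weight vectors (dictionaries from term to weight), uses cosine similarity, and takes a similarity threshold whose value here is arbitrary.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(vectors, threshold=0.4):
    """Textbook Single-Pass clustering; threshold is an assumed value.

    Each vector is compared with every existing cluster centroid. It joins
    the most similar cluster if the similarity reaches the threshold,
    otherwise it starts a new cluster (a new topic).
    """
    clusters = []  # each cluster: {"centroid": dict, "members": [indices]}
    for index, vec in enumerate(vectors):
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Update the centroid as the running mean of its members.
            n = len(best["members"])
            terms = set(best["centroid"]) | set(vec)
            best["centroid"] = {
                t: (best["centroid"].get(t, 0.0) * n + vec.get(t, 0.0)) / (n + 1)
                for t in terms
            }
            best["members"].append(index)
        else:
            clusters.append({"centroid": dict(vec), "members": [index]})
    return clusters


vectors = [
    {"price": 1.0, "rise": 1.0},
    {"price": 1.0, "high": 1.0},
    {"football": 1.0, "match": 1.0},
]
topics = single_pass(vectors, threshold=0.4)
```

Because each thread is visited once and in order, the result depends on the input order and on the threshold, which is one reason such models usually need tuning in practice.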
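The hotness scoring in item (3) names its inputs but not the formula. As a rough sketch only, the Python function below combines the four quantities as a weighted sum. The weights, the field names `valuable`, `replies`, and `views`, and the reading that both replies and views are measured per hour are all assumptions, not details taken from the thesis.

```python
def hotness(topic_threads, hours, weights=(1.0, 2.0, 0.5, 0.1)):
    """Hypothetical hotness score for one topic (a cluster of threads).

    topic_threads: list of dicts with assumed keys
        "valuable" (bool), "replies" (int), "views" (int).
    hours: length of the observation window in hours, must be positive.
    weights: assumed coefficients for the four inputs, in the order
        threads, valuable threads, replies per hour, views per hour.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    thread_count = len(topic_threads)
    valuable_count = sum(1 for t in topic_threads if t["valuable"])
    replies_per_hour = sum(t["replies"] for t in topic_threads) / hours
    views_per_hour = sum(t["views"] for t in topic_threads) / hours
    w_threads, w_valuable, w_replies, w_views = weights
    return (
        w_threads * thread_count
        + w_valuable * valuable_count
        + w_replies * replies_per_hour
        + w_views * views_per_hour
    )


topic = [
    {"valuable": True, "replies": 10, "views": 100},
    {"valuable": False, "replies": 2, "views": 20},
]
score = hotness(topic, hours=4)
```

Topics found by clustering could then be ranked by this score, with the weights adjusted to reflect how much each signal should matter on a given forum.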
Keywords/Search Tags: Hot Topic Mining, Opinion Analysis, Web Text Mining, Text Clustering, Dependency Parser