
Research On Contrast Viewpoint Summarization For Opinionated Text

Posted on: 2014-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: X Liang
Full Text: PDF
GTID: 2248330395467820
Subject: Computer Science and Technology
Abstract/Summary:
Opinionated texts carry a large amount of information and have become an important data source for further analysis. However, the rapid growth of online opinionated text, together with the redundancy within it, makes it difficult to use these texts well and extract useful information from them. Summarizing contrastive viewpoints in opinionated text has therefore emerged as an important research issue.

Contrastive viewpoint summarization of opinionated texts involves six steps, including preprocessing the input data, computing the topic and sentiment attributes of the texts, classifying the texts by topic, computing a centrality score for each text, and generating the contrastive viewpoint summary.

LDA is suited to topic analysis of texts, while TAM can analyze both the topic and the sentiment attributes of texts. To compute the topic and sentiment attributes of opinionated texts, this thesis adopts the TAM model, with parameters estimated by Gibbs sampling.

This thesis implements basic LexRank, Comparative LexRank and Biased LexRank, and proposes and implements three further algorithms: Topic-sensitive TF-IDF LexRank, Topic-sensitive TF-IDF&Comparative LexRank and Biased&Comparative LexRank. To account for the multi-topic nature of opinionated texts, Topic-sensitive TF-IDF LexRank modifies the TF-IDF calculation so that it is sensitive to the topic attribute. Comparative LexRank and Biased LexRank are sensitive to sentiment and to topic, respectively. Topic-sensitive TF-IDF&Comparative LexRank and Biased&Comparative LexRank use both the topic and sentiment attributes of opinionated texts, making them sensitive to both.

Experiments show that the TAM-TCLR summarization algorithm performs best when Topic-sensitive TF-IDF&Comparative LexRank is used as the centrality score algorithm for opinionated texts, and that the contrastive viewpoint summary generated in this configuration is the best.
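
For readers unfamiliar with the baseline, the following is a minimal sketch of a basic LexRank-style centrality score, not the thesis's implementation. It assumes pre-tokenized sentences, a smoothed IDF of log(1 + N/df), a cosine-similarity threshold of 0.1, and a PageRank-style damping factor of 0.85. All function names and parameter values here are illustrative assumptions. The topic-sensitive, comparative and biased variants are not sketched, because the abstract does not give their formulas.

```python
import math
from collections import Counter


def tf_idf_vectors(sentences):
    """Build one sparse TF-IDF vector (a dict) per tokenized sentence.

    The smoothed IDF log(1 + N/df) is an assumption made for this sketch.
    """
    n = len(sentences)
    doc_freq = Counter(term for sent in sentences for term in set(sent))
    return [
        {t: c * math.log(1 + n / doc_freq[t]) for t, c in Counter(sent).items()}
        for sent in sentences
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def lexrank(sentences, threshold=0.1, damping=0.85, iterations=100):
    """Score sentences by centrality in a thresholded similarity graph."""
    n = len(sentences)
    vectors = tf_idf_vectors(sentences)

    # Unweighted graph: an edge exists when similarity exceeds the threshold.
    # Each non-empty sentence is linked to itself (self-similarity is 1).
    matrix = [
        [1.0 if cosine(vectors[i], vectors[j]) > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]

    # Row-normalize so each row is a probability distribution.
    for row in matrix:
        total = sum(row)
        for j in range(n):
            row[j] = row[j] / total if total else 1.0 / n

    # Power iteration with uniform teleportation, as in PageRank.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * matrix[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


sentences = [
    ["battery", "life", "screen", "quality"],
    ["battery", "life", "lasts"],
    ["screen", "quality", "sharp"],
    ["shipping", "was", "slow"],
]
scores = lexrank(sentences)
best = max(range(len(sentences)), key=lambda i: scores[i])
print(" ".join(sentences[best]))
```

In this toy input the first sentence shares terms with two others, so it receives the highest centrality score. A summarizer would then select the top-scoring sentences within each topic and sentiment group.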
Keywords/Search Tags: Topic Model, Opinionated Text, Contrast Viewpoint Summarization