Research On Topic-sentiment Mining From Multiple Text Collections

Posted on: 2016-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Zhu
GTID: 2308330476953324
Subject: Computer Science and Technology
Abstract/Summary:
This paper studies topic-sentiment mining from multiple text collections. Each collection is represented as a mixture of components and modeled with the hierarchical Dirichlet process, which determines the number of components automatically. Each component consists of topic words and their sentiments. The proposed model extends the topic-sentiment mixture: for each collection it mines topics with their own topic proportions and sentiment proportions, together with one positive and one negative word distribution. The model is implemented with a Markov chain Monte Carlo method, and experiments show that it finds meaningful topics and their sentiments. The model is flexible and compares favorably with the joint sentiment-topic and multifaceted models in terms of parameter settings, and experiments on the Multi-Domain Sentiment Dataset show that it can analyze sentiment. In the iterative experiments on Chinese event-based microblog and news texts, we analyze the parameters that control the similarity of topic proportions across collections and find several differences between microblogs and news media. News covers event content in more detail, whereas microblogs contain more discussion. For example, news media reported on “Occupy Wall Street” in detail and expressed negative sentiment, while microblog users focused on China-related issues, which led to mixed sentiment. Chinese news reports also tend to emphasize their own country’s efforts when the country is involved in an event, as in the coverage of the “Missing MH370” event.
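
The claim that the number of components is determined automatically can be illustrated with the Chinese restaurant process, a standard construction behind Dirichlet process mixtures. The sketch below is not the model from the thesis: it is a single-level process with no sentiment layer and no sharing across collections, and the names `crp_assignments`, `alpha`, and `seed` are hypothetical, chosen only for this illustration.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Assign n_items to components with a Chinese restaurant process.

    Illustrative assumption: alpha > 0 is the concentration parameter.
    Larger alpha tends to open more components.
    """
    rng = random.Random(seed)
    counts = []   # counts[k] = number of items currently in component k
    labels = []   # labels[i] = component chosen for item i
    for _ in range(n_items):
        # An existing component k is chosen with weight counts[k];
        # a brand-new component is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new component
        else:
            counts[k] += 1        # join an existing component
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_assignments(n_items=200, alpha=1.0)
    print("components opened:", len(counts))
    print("component sizes:", counts)
```

The number of components is never passed in; it emerges from the data size and `alpha`. The hierarchical Dirichlet process named in the abstract goes further by letting several collections draw from a shared pool of components, which this sketch does not attempt.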
Keywords/Search Tags: topic modeling, hierarchical Dirichlet process, sentiment analysis, text mining, comparative analysis