Research On Opinion Analysis For Social Media

Posted on: 2017-04-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S F Xiong
Full Text: PDF
GTID: 1368330512454955
Subject: Computer software and theory

Abstract/Summary:
Opinion analysis is the computational study of people's opinions, appraisals, and emotions toward entities, events, and their attributes. Research in the field began with sentiment and subjectivity classification, both of which were treated as text classification problems. Sentiment classification determines whether an opinionated document or sentence expresses a positive or negative opinion. Subjectivity classification determines whether a sentence is subjective or objective. Many real-life applications, however, require more detailed analysis because users often want to know what the opinions are about. Opinion analysis is therefore a complicated problem that consists of several subtasks. This dissertation studies in depth several key technologies of opinion analysis for social media. Its main content is as follows:

1. To tackle the text sparsity problem in product reviews, we propose a joint sentiment-topic model for short text reviews. Unlike other topic models, which model the generative process of each document, we directly model the generation of the whole corpus. In the generative process of the model, all words in a sentence share the same sentiment polarity, and the words in each word pair share the same topic. We apply the proposed model to two social media datasets. The experimental results demonstrate that it is effective for topic discovery and sentiment analysis, and that it outperforms existing approaches.

2. Aspect-phrase grouping is an important task for aspect finding in aspect-level sentiment analysis. Most existing methods for this task are based on a window-context model, which assumes that aspect phrases of the same aspect have similar co-occurrence contexts. This model does not always work well in practice. In this dissertation, we develop a novel weighted context representation model based on semantic relevance, which uses word embeddings to represent aspect phrases. We encode lexical knowledge as constraints with a degree of belief and propose a flexible-constrained K-means algorithm to cluster aspect phrases. We also explore a novel idea, capacity limitation, which states that the number of aggregated sentences in an aspect-phrase group has an upper bound, and we propose a capacity-constrained K-means algorithm that encodes this limitation as a constraint. Empirical evaluation shows that the proposed method outperforms existing state-of-the-art methods.

3. Aspect-phrase grouping is a challenging problem because of polysemy and context dependency. To tackle this problem, we propose an Attention-based Deep Distance Metric Learning method that considers both the aspect phrase representation and the context representation. First, leveraging the characteristics of review text, we automatically generate aspect phrase sample pairs for distant supervision. Second, we feed the word embeddings of aspect phrases and their contexts into an attention-based neural network to learn feature representations of the contexts. Both the aspect phrase embedding and the context embedding are used to learn a deep feature subspace in which the distance between positive sample pairs is smaller and the distance between negative pairs is larger. Distances in this subspace are then used to measure the distance between aspect phrases for K-means clustering. Experiments on four review datasets show that the proposed method outperforms strong state-of-the-art baselines.

4. General graph-based methods for opinion summarization have limitations in sentence topical grouping and content diversity. In this dissertation, we propose a novel hypergraph-based vertex-reinforced random walk framework for multi-document summarization. The framework first uses the Hierarchical Dirichlet Process topic model to learn a word-topic probability distribution in sentences. A hypergraph is then used to capture both the cluster relationships based on the word-topic probability distribution and the pairwise similarity among sentences. Finally, a time-variant random walk algorithm for hypergraphs is developed to rank sentences; it ensures sentence diversity in summaries through vertex reinforcement. Experimental results on a publicly available dataset demonstrate the effectiveness of our framework.

5. Sentiment-specific word representation is an effective method for improving Twitter sentiment classification; it encodes both n-gram and distantly supervised tweet sentiment information during learning. Existing methods assume that all words within a tweet have the same sentiment polarity as the whole tweet, which ignores each word's own sentiment polarity. To address this problem, we propose to learn sentiment-specific word embeddings by exploiting both lexicon resources and distantly supervised information. We develop a multi-level sentiment-enriched word embedding learning method, which uses a parallel asymmetric neural network to model n-gram, word-level sentiment, and tweet-level sentiment information during learning. Experiments on standard benchmarks show that our approach outperforms state-of-the-art methods.
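
The vertex-reinforced random walk named in contribution 4 can be illustrated with a minimal sketch. This is not the dissertation's hypergraph algorithm: it runs on an ordinary pairwise sentence-similarity graph, and it approximates each vertex's visit count by its current score, a simplification chosen here for brevity. The function name `vertex_reinforced_rank`, the `damping` and `iterations` parameters, and the toy similarity matrix are all assumptions made for illustration.

```python
import numpy as np

def vertex_reinforced_rank(similarity, damping=0.85, iterations=100):
    """Rank vertices with a simplified vertex-reinforced random walk.

    Hypothetical helper: transitions into a vertex are scaled by that
    vertex's current score, so the transition matrix changes over time
    (it is "time-variant") and already-favoured vertices are reinforced.
    """
    sim = np.asarray(similarity, dtype=float)
    n = sim.shape[0]
    # Time-invariant prior transition matrix: row-normalized similarities.
    p0 = sim / sim.sum(axis=1, keepdims=True)
    # Start from a uniform score distribution over vertices.
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        # Reinforcement step: weight each column by the current score
        # of the destination vertex, then renormalize every row.
        reinforced = p0 * scores[np.newaxis, :]
        reinforced /= reinforced.sum(axis=1, keepdims=True)
        # Mix with a uniform jump so every vertex keeps a positive score.
        transition = damping * reinforced + (1.0 - damping) / n
        scores = scores @ transition
        scores /= scores.sum()
    return scores

# Toy similarity matrix (assumed data): sentences 0-2 are near-duplicates,
# sentence 3 is only weakly related to the others.
sim = [
    [1.0, 0.9, 0.9, 0.1],
    [0.9, 1.0, 0.8, 0.1],
    [0.9, 0.8, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
scores = vertex_reinforced_rank(sim)
ranking = np.argsort(-scores)  # vertex indices, highest score first
```

Because reinforcement concentrates score on a few vertices within each group of similar sentences, selecting the top-ranked vertices tends to favour diverse sentences over near-duplicates, which is the intuition the abstract appeals to.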
Keywords/Search Tags: topic model, opinion analysis, aspect-phrase grouping, word embedding learning, summarization, K-Means, review analysis, deep metric learning