
Research On Sentimental Lexicon Construction For Text Sentiment Analysis

Posted on: 2011-01-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W F Du
Full Text: PDF
GTID: 1118360332456383
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of computer network and information technologies, the Internet has become a significant medium, and web documents have become an indispensable information source for people's daily lives. In the Web 2.0 era in particular, the Internet has turned from a static medium into a dynamic platform on which people can express and exchange their personal experiences and opinions about almost everything: news reviews, product reviews, blogs, etc. This user-generated content contains very valuable information. How to mine these opinions automatically and efficiently is therefore a very challenging question, and one with promising applications in enterprise business intelligence, public opinion surveys, and similar areas. Sentiment analysis (opinion mining) aims to automatically identify the emotions, including opinions and attitudes, expressed in texts. Constructing a sentiment lexicon is a fundamental and important task for sentiment analysis research. This dissertation therefore focuses on the problem of how to construct a sentiment lexicon automatically, which is of great significance for the development and application of sentiment analysis technologies. The main research topics of this dissertation are as follows:

The first part of this dissertation introduces the research background of sentiment analysis and sentiment lexicon construction, analyzes the challenges faced in constructing a sentiment lexicon, and makes clear the significance of research on this point. It then reviews the state of research on sentiment analysis and sentiment lexicon construction.

To reduce the dependency of sentiment lexicon construction algorithms on paradigm words, the dissertation proposes a function optimization based general sentiment lexicon construction algorithm. Most existing methods utilize only local information between the unlabeled words and the paradigm words. To this end, we propose a function optimization based general sentiment lexicon construction algorithm from the graph-partitioning point of view, and solve it using the simulated annealing algorithm. The experimental results show that the proposed framework is reasonable and the solution is valid.

To improve the ability of graph-partitioning based sentiment lexicon construction algorithms to avoid falling into local minima, a modularity optimization mechanism is considered. The essential idea of graph-partitioning based approaches is the 'minimum cut', that is, to look for a division of the vertices into two subgroups that minimizes the number of edges running between the subgroups. However, these methods are inclined to put all of the vertices in one of the two subgroups and none in the other. To address this problem, this dissertation proposes a modularity optimization based general sentiment lexicon construction algorithm.

To address the problem of domain transfer of sentiment lexicons, the dissertation proposes an information bottleneck based domain-oriented sentiment lexicon construction algorithm. Most existing construction approaches take into account only the relationships between words, which leaves a lot of room for improvement. This dissertation proposes an adapted information bottleneck method for the construction of a domain-oriented sentiment lexicon. This approach can naturally make full use of the mutual reinforcement between documents and words by fusing three kinds of relationships, which may be from words to documents or from words to words, homogeneous or heterogeneous, and within-domain or cross-domain.

Finally, a fine-grained product review mining system is designed and implemented using iterative reinforcement techniques. In this system, product features and opinion words are clustered and associated simultaneously and iteratively by fusing both their semantic information and their co-occurrence information. Furthermore, we implement a feature-based product recommender system that makes use of the sentiment lexicon.
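
As a rough illustration of the contrast the abstract draws between minimum cut and modularity, the following Python sketch uses a hypothetical six-word graph. The words, the edges, and the function names `cut_size` and `modularity` are assumptions made here for illustration and are not taken from the dissertation. The sketch uses the standard Newman modularity score, which may differ from the exact objective the dissertation optimizes. It shows that putting every vertex in one subgroup trivially gives a cut of zero, while modularity gives that degenerate split no credit.

```python
from collections import defaultdict

# Hypothetical toy graph: nodes are words, edges link words assumed to be related.
EDGES = [
    ("good", "great"), ("good", "nice"), ("great", "nice"),
    ("bad", "poor"), ("bad", "awful"), ("poor", "awful"),
    ("nice", "poor"),  # a single noisy link between the two groups
]

def cut_size(edges, group):
    """Count edges with exactly one endpoint in `group`."""
    return sum((u in group) != (v in group) for u, v in edges)

def modularity(edges, group):
    """Newman modularity of the two-way split (group, everything else)."""
    m = len(edges)
    internal = defaultdict(int)  # number of edges inside each side
    degree = defaultdict(int)    # total degree of each side
    for u, v in edges:
        side_u, side_v = u in group, v in group
        degree[side_u] += 1
        degree[side_v] += 1
        if side_u == side_v:
            internal[side_u] += 1
    return sum(internal[s] / m - (degree[s] / (2 * m)) ** 2 for s in (True, False))

ALL_WORDS = {w for edge in EDGES for w in edge}
POSITIVE = {"good", "great", "nice"}  # assumed "natural" split for this toy graph

# Degenerate split: every word on one side, none on the other.
trivial_cut, trivial_q = cut_size(EDGES, ALL_WORDS), modularity(EDGES, ALL_WORDS)
# Balanced split along the assumed polarity groups.
split_cut, split_q = cut_size(EDGES, POSITIVE), modularity(EDGES, POSITIVE)

print("all in one group:", trivial_cut, round(trivial_q, 3))
print("balanced split:  ", split_cut, round(split_q, 3))
```

On this toy graph the degenerate split wins under a pure minimum-cut criterion (cut 0 versus 1) but scores a modularity of 0, whereas the balanced split scores higher, which is the behavior that motivates a modularity-based objective.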
Keywords/Search Tags: Sentiment analysis, Sentiment lexicon, Function optimization, Modularity, Information bottleneck, Iterative reinforcement