
An Effective Information Representation for Opinion-oriented Applications

Posted on: 2013-04-03
Degree: Ph.D
Type: Thesis
University: The Chinese University of Hong Kong (Hong Kong)
Candidate: Li, Binyang
Full Text: PDF
GTID: 2458390008487151
Subject: Computer Science
Abstract/Summary:
Users increasingly express their opinions about products, films, and politics through online tools such as forums, blogs, and Facebook. These opinions not only help users make decisions, e.g., whether to buy a product, but also provide valuable feedback on businesses and social events. Research on opinion-oriented applications (OOAs), including opinion retrieval, opinion summarization, and opinion question answering, is therefore attracting much attention. The difference between fact-based and opinion-oriented applications lies in the users' information need: the former require objective information, while the latter require subjective information, which consists of opinions or comments expressed on a specific target. To meet the need for subjective information, both opinionatedness and relevance, together with the association between them, should be taken into account. Existing systems represent documents as a bag of words. However, this representation fails to distinguish opinionatedness from relevance. Moreover, because it ignores word order, the associations between words are lost. For this reason, the bag-of-words representation is ineffective for subjective information and seriously degrades the performance of OOAs.

In this thesis, we try to answer the following challenging questions that arise in subjective information representation:
· Since the word is no longer the basic semantic unit, how should subjective information be represented?
· Subjective information is a combination of opinionatedness and relevance, so how should the association between them be modeled?
· How should subjective information be measured for the purposes of document ranking, retrieval, and analysis?
· How can opinion-oriented applications benefit from subjective information?

We start by solving the problem of opinion retrieval, whose results can directly influence the performance of other opinion-oriented applications.
We first present a sentence-based approach to analyze the limitations of the bag-of-words representation and define a semantically richer representation for subjective information, namely the word pair. A word pair consists of a sentiment word and its associated target co-occurring in a sentence. We then propose techniques to capture two kinds of contextual information: 1) intra-opinion information, for which three methods are proposed to extract word pairs; and 2) inter-opinion information, for which a weighting scheme is presented to measure the weight of each individual word pair. Finally, we devise an algorithm to integrate both intra-opinion and inter-opinion information into a latent sentimental association model for opinion retrieval. The evaluation on three benchmark datasets suggests that the word pair is effective and that the latent sentimental association retrieval model provides insight into word associations, so that opinion retrieval benefits from the pairwise representation. We also apply the word pair to opinion summarization and opinion question answering. The evaluation on two benchmark datasets shows that the word pair performs effectively in these applications.
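As a rough illustration of the word pair idea, the following minimal Python sketch pairs every sentiment word with every target word that co-occurs in the same sentence. It is not one of the three extraction methods proposed in the thesis, which are not specified in this abstract. The lexicons `SENTIMENT_WORDS` and `TARGET_WORDS`, the helper `split_sentences`, and the function `extract_word_pairs` are all hypothetical names introduced here for illustration.

```python
import re

# Hypothetical toy lexicons for illustration only; they are not from the thesis.
SENTIMENT_WORDS = {"great", "terrible", "love", "boring"}
TARGET_WORDS = {"camera", "battery", "plot", "screen"}


def split_sentences(text):
    """Naively split text into sentences on '.', '!' and '?'."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def extract_word_pairs(text):
    """Return (sentiment word, target word) pairs that co-occur in a sentence.

    Assumption: plain co-occurrence within a sentence stands in for the
    association between a sentiment word and its target.
    """
    pairs = []
    for sentence in split_sentences(text):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        sentiments = [t for t in tokens if t in SENTIMENT_WORDS]
        targets = [t for t in tokens if t in TARGET_WORDS]
        # Pair each sentiment word with each target in the same sentence.
        pairs.extend((s, t) for s in sentiments for t in targets)
    return pairs


review = "The camera is great. The battery is terrible!"
print(extract_word_pairs(review))
# [('great', 'camera'), ('terrible', 'battery')]
```

A bag-of-words view of the same review would contain "great", "terrible", "camera", and "battery" with no record of which sentiment attaches to which target; keeping the pairs within sentence boundaries preserves that association.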
Keywords/Search Tags: Opinion, Information, Word pair, Applications, Representation