Research On Key Techniques Of Opinion Mining For Entity

Posted on: 2015-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhang
GTID: 2268330428472981
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, more and more users publish their own assessments of products or events online. This information is highly valuable: it can help government services set appropriate measures and can guide businesses and consumers. At the same time, the data on the Internet grows at an exponential rate, so obtaining valuable information from product reviews by hand is time-consuming and labor-intensive. Obtaining useful information automatically with computer programs has therefore become very important.

At present, opinion mining is carried out at three main levels: text level, sentence level and feature level. Text-level opinion mining assumes that each text relates to only one product. Sentence-level opinion mining covers two tasks: identifying subjective and objective sentences, and analyzing the sentiment orientation of each sentence. It likewise assumes that a sentence contains only a single point of view. Feature-level opinion mining covers three tasks: 1) identifying and extracting the features of an entity, 2) determining the sentiment orientation toward those features, and 3) providing a multi-angle opinion summary based on the features. However, text on the Internet often involves more than one entity and usually contains several descriptions related to each of them. This thesis therefore takes the entity as the basic unit and focuses on entity-level opinions in Sina blog posts. The research work and innovations are as follows:

(1) This thesis proposes an Entity Topic Model (ETM) that extracts entities and their corresponding evaluation words according to the distributions it learns. The model extends Latent Dirichlet Allocation by adding an entity layer between the document layer and the topic layer. Each entity is represented as a mixture of topics, and each topic is associated with a multinomial distribution over words. The basic idea is to select entity labels according to the entities the author comments on when writing a blog post, and then to use these labels to guide the generation of the words in the text. ETM assigns the words related to one entity to one topic and uncovers the latent semantic relationships between entities, topics and words.

(2) Mutual information is used for a second-pass extraction of the words related to each entity. Mutual information measures the correlation between two variables, and an entity and its corresponding evaluation words tend to appear in the same text. This step therefore excludes words that share a topic with the evaluation words but are unrelated to the entity.

(3) This thesis puts forward a method for constructing a context-free emotional dictionary based on word meaning. It then uses association rules to extract fixed word collocations from the corpus and assesses the emotional tendencies of these word combinations to construct a contextual emotional dictionary. Finally, an intuitive orientation analysis of each entity is obtained from the emotional dictionaries and the analysis of the words related to that entity.
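The mutual-information filter in contribution (2) can be illustrated with a small sketch. The abstract does not give the exact formula or counting unit, so the code below assumes pointwise mutual information (PMI) computed from document-level co-occurrence. The function names `pmi_scores` and `filter_by_pmi`, the threshold, and the toy corpus are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def pmi_scores(docs, entity, candidates):
    """Document-level PMI between `entity` and each candidate word.

    Assumption: PMI(e, w) = log2( P(e, w) / (P(e) * P(w)) ), where each
    probability is the fraction of documents containing the term(s).
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    # Document frequency of every word in the corpus.
    df = Counter(w for s in doc_sets for w in s)
    scores = {}
    for w in candidates:
        joint = sum(1 for s in doc_sets if entity in s and w in s)
        if joint == 0:
            # Never co-occurs with the entity: treat as minus infinity.
            scores[w] = float("-inf")
        else:
            scores[w] = math.log2((joint * n) / (df[entity] * df[w]))
    return scores


def filter_by_pmi(scores, threshold=0.0):
    """Keep candidates whose PMI with the entity exceeds `threshold`."""
    return sorted(w for w, s in scores.items() if s > threshold)


# Toy corpus (hypothetical): each document is a list of tokens.
docs = [
    ["phone", "screen", "bright"],
    ["phone", "bright"],
    ["camera", "sharp"],
    ["camera", "sharp", "lens"],
]
scores = pmi_scores(docs, "phone", ["bright", "sharp"])
keep = filter_by_pmi(scores)
```

In this toy corpus "bright" always appears alongside "phone" and is kept, while "sharp" never co-occurs with "phone" and is dropped. This mirrors the idea of discarding words that belong to the same topic but are unrelated to the entity.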
Keywords/Search Tags: Opinion Mining, Topic Model, Association Rules, Mutual Information, Emotional Dictionary