
Sentiment Analysis On Entity Search Results

Posted on: 2013-08-25  Degree: Master  Type: Thesis
Country: China  Candidate: S Hu  Full Text: PDF
GTID: 2298330392467972  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet communities such as forums, more and more users participate in and contribute data to the Internet. A large part of this data consists of commentary on people and events. By reading it, users can learn the public's opinion on the things they are interested in. Because of its sheer volume, the sentiment information on the Internet is hard to collect and process by hand. Search engines are the main shortcut to obtaining information, but they attend only to documents relevant to the query and ignore the sentiment information in those documents. This paper therefore combines sentiment analysis with search engine techniques. When the query is an entity, we classify the sentiment of each sentence that contains the entity into one of three classes: positive, negative, or objective. The results can support other tasks such as sentiment retrieval and information filtering, and are of great practical value. The entity types we focus on are digital products, persons, organizations, and policies.

First, this paper proposes an SVM-based method for the relevant sentence recognition problem and introduces several features, such as dependency paths from the entity to polarity words. With this method, sentences that actually discuss the entity are recognized, and the proportion of relevant sentences rises from 77.5% to 85.85%.

Second, this paper proposes a context-expansion-based sentence domain recognition method, which treats a sentence together with its previous and following 2 sentences as a whole. The combined sentences are used to represent the sentence that contains the entity and are classified as a unit. This method expands the content of the sentence that contains the entity and overcomes data sparseness to some extent. Compared with classifying the entity sentence directly, it significantly improves classification accuracy; however, performance on the policy and organization domains is poor. Our analysis finds that this is caused by the high similarity between the features of policy and organization segments.

Finally, the sentences that contain the entity are classified by sentiment into one of three categories: positive, negative, or neutral. We adopt an SVM-based approach with two kinds of features: polarity words and unigrams. The experimental results show that using both kinds of features performs better than using either kind alone, and that unigrams are more effective than polarity words. We also analyze performance under different feature set sizes and find that, as the feature size grows, classification accuracy soon reaches its peak value. This indicates that feature selection is essential for sentiment classification of web sentences.
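The context-expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name `expand_context`, the plain substring test for locating the entity, and the reading of "previous and following 2 sentences" as two sentences on each side are all assumptions made for illustration.

```python
from typing import List, Tuple


def expand_context(
    sentences: List[str], entity: str, window: int = 2
) -> List[Tuple[int, str]]:
    """Return (index, expanded_text) for every sentence mentioning `entity`.

    Assumption: `window` neighbours are taken on each side, and the entity
    is detected with a simple substring match (a hypothetical stand-in for
    whatever entity matching the thesis actually uses).
    """
    expanded = []
    for i, sentence in enumerate(sentences):
        if entity not in sentence:
            continue
        # Clamp the window to the document boundaries.
        start = max(0, i - window)
        end = min(len(sentences), i + window + 1)
        expanded.append((i, " ".join(sentences[start:end])))
    return expanded
```

Each expanded text would then stand in for the original entity sentence when it is passed to the domain classifier, which gives the classifier more tokens to work with than the single sentence provides.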
Keywords/Search Tags: Information Retrieval, Sentiment Analysis, Entity Retrieval, Sentence Domain Recognition, Sentence Level Sentiment Classification