
Method Of Chinese Short Text Unknown Words Discovery And Sentiment Analysis

Posted on: 2018-09-04, Degree: Master, Type: Thesis
Country: China, Candidate: S Q Chen, Full Text: PDF
GTID: 2348330563952436, Subject: Computer Science and Technology
Abstract/Summary:
With the development of Internet technology, many messaging and social tools such as microblogs and WeChat have appeared. Because they are simple and easy to use, they have become an important means of everyday communication for the public. Sentiment analysis of the short texts they produce is therefore significant for tracking product reputation and monitoring public opinion. Short text is characterized by sparse features, non-standard expression and so on, and because the content is short and the features are few, traditional sentiment analysis methods cannot fully adapt to it. This paper is devoted to short-text sentiment analysis. Building on existing feature extraction and sentiment analysis methods, we propose several improvements: Unknown Words discovery, AdaBoost enhancement of weak classifiers, and feature selection optimization with a Deep Belief Network (DBN). The work covers three aspects.

(1) Construction and optimization of dictionary resources, and rule-based sentiment analysis. This part discovers Unknown Words using probability statistics and Conditional Random Fields, marks the emotional tendency of the Unknown Words based on Mutual Information, and then uses syntactic analysis and semantic similarity to find domain-specific emotional words, realizing the automatic expansion and optimization of the emotional dictionary.

(2) Sentiment analysis using machine learning. This part applies machine learning methods to short-text sentiment analysis. Word2vec is used to filter noise words and to extend the feature vectors. Different feature selection algorithms are compared, and classifiers are trained to improve classification accuracy. Finally, we explore the AdaBoost algorithm for boosting weak classifiers.

(3) Feature selection optimization and emotion classification based on a Deep Belief Network. This part uses the DBN to realize feature selection. Making full use of the characteristics of microblogs, it uses context information such as comments and reposts to extend the original text, enriching its semantics and increasing its feature density. Finally, a comparison with the COAE2015 results demonstrates the effectiveness of our method.

This paper explores automatic Unknown Words discovery and emotional dictionary optimization through a variety of methods tailored to short text, which is an effective way to improve the performance of sentiment analysis.
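
As an illustration of how the emotional tendency of an Unknown Word might be marked with Mutual Information, the following is a minimal sketch of the commonly used SO-PMI scheme: a word is scored by its pointwise mutual information with positive seed words minus that with negative seed words. This is not taken from the thesis itself. The function name `so_pmi`, the seed words, the add-one smoothing and the toy corpus are all assumptions made for the example, and documents are assumed to be already segmented into tokens.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, candidates, pos_seeds, neg_seeds, smoothing=1.0):
    """Score candidate words by PMI with positive seeds minus PMI with negative seeds.

    docs: list of token lists (assumed to be pre-segmented short texts).
    Returns a dict mapping each candidate to its score; > 0 suggests a
    positive tendency, < 0 a negative one. Hypothetical helper, not the
    thesis's actual implementation.
    """
    n_docs = len(docs)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing each unordered word pair
    for tokens in docs:
        unique = set(tokens)
        word_df.update(unique)
        pair_df.update(frozenset(p) for p in combinations(sorted(unique), 2))

    def pmi(a, b):
        # Additive smoothing keeps the logarithm defined for pairs that never co-occur.
        joint = pair_df[frozenset((a, b))] + smoothing
        return math.log2(
            joint * n_docs / ((word_df[a] + smoothing) * (word_df[b] + smoothing))
        )

    return {
        w: sum(pmi(w, s) for s in pos_seeds) - sum(pmi(w, s) for s in neg_seeds)
        for w in candidates
    }


# Toy corpus: "给力" and "坑爹" stand in for newly discovered words.
toy_docs = [
    ["手机", "给力", "好"],
    ["服务", "给力", "好"],
    ["屏幕", "坑爹", "差"],
    ["物流", "坑爹", "差"],
]
scores = so_pmi(toy_docs, ["给力", "坑爹"], pos_seeds={"好"}, neg_seeds={"差"})
for word, score in scores.items():
    print(word, "positive" if score > 0 else "negative")
```

In this toy corpus the candidate that co-occurs with the positive seed receives a positive score and the other a negative one. In practice the seed sets, the co-occurrence window and the decision threshold would have to be chosen for the target domain.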
Keywords/Search Tags: Sentiment Analysis, Unknown Words Discovery, Feature Extension, Deep Belief Network