
Research On Uyghur Sentiment Analysis

Posted on: 2018-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: E X T T E G Yi
Full Text: PDF
GTID: 2348330533456503
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, more and more users actively create large amounts of data online, such as microblog posts, forum posts, and e-commerce site reviews. What these data have in common is that most user-generated content expresses the user's attitude and emotion toward a particular social issue or product. Analyzing such emotion-bearing data has both economic and social value. For example, an e-commerce site can analyze user reviews to understand user needs and product performance, and thereby improve product quality and service levels. Government departments can analyze Internet users' comments to quickly understand hot social issues and users' attitudes toward them, enabling effective public opinion analysis and monitoring. Research on Uyghur sentiment analysis arose in this context.

Building on previous work, this thesis creates a Uyghur sentiment corpus and applies traditional machine learning algorithms and neural network algorithms to it in order to find the method best suited to Uyghur sentiment analysis. Although earlier scholars have studied Uyghur sentiment analysis, most approached it from a single point of view and did not run comparative experiments across the whole process. This thesis uses a number of Python tools, including NumPy, pandas, SciPy, Matplotlib, BeautifulSoup, Tkinter, scikit-learn, Gensim, and Keras. It examines each stage of sentiment analysis, from formulating the annotation specification of the sentiment corpus to testing and using sentiment classifiers, determining the effect of each stage on the final result and identifying the best approach at each stage through comparative experiments.

Developing the annotation specification of the sentiment corpus is the beginning and the foundation of the whole process. The quality of the specification directly affects the quality of the sentiment corpus, and thus the final result. This thesis defines eight coarse-grained sentiment categories and 25 fine-grained sentiment categories, so that the sentiment corpus is both practical and extensible. In the feature selection phase, the simplest method, which is based on document frequency, performs best. In the feature weighting phase, the TF-IDF-based method performs best. Among the traditional machine learning algorithms, SVM performs best, with an accuracy of 80.12%, which is 4.45% higher than that of the convolutional neural network, making it the most suitable classification algorithm for Uyghur sentiment analysis.
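As an illustration of the two feature steps named in the abstract, the following is a minimal Python sketch of document-frequency feature selection followed by TF-IDF weighting. It is not the thesis's implementation: the function names, the `min_df` threshold, the plain `tf * log(N / df)` formula, and the toy English tokens are all assumptions made for this example, since the abstract does not give the exact formulas, thresholds, or tokenization used.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): each document counts a term once
    return df


def select_by_df(docs, min_df=2):
    """Keep terms that occur in at least `min_df` documents.

    `min_df` is a hypothetical threshold chosen for this sketch.
    """
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)


def tfidf_vector(tokens, vocabulary, df, n_docs):
    """Weight each selected term by tf * log(N / df) (one common TF-IDF variant)."""
    tf = Counter(tokens)
    return [tf[term] * math.log(n_docs / df[term]) for term in vocabulary]


# Toy, already-tokenized documents (placeholders, not from the thesis corpus).
docs = [
    ["good", "phone", "good"],
    ["bad", "phone"],
    ["good", "service"],
]

df = document_frequency(docs)
vocab = select_by_df(docs, min_df=2)
vectors = [tfidf_vector(d, vocab, df, len(docs)) for d in docs]
```

The resulting `vectors` could then be passed to a classifier such as an SVM. In practice, a library vectorizer (for example, one from scikit-learn, which the abstract lists among its tools) would likely replace these hand-written helpers.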
Keywords/Search Tags: Uyghur, Sentiment analysis, Sentiment corpus, Machine learning