Research On Sentiment Analysis Methods In Tibetan Texts

Posted on: 2018-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: M M Li
GTID: 2358330533955052
Subject: Computer application technology
Abstract/Summary:
As an important branch of natural language processing, public opinion analysis has attracted growing attention in recent years. With the development of Tibetan information technology, Tibetan is also entering the era of natural language processing, and sentiment analysis of Tibetan text has become a popular new research topic. Because the field started late, however, several aspects of it remain to be developed. Building on sentiment analysis research in China and abroad, and taking the grammatical features of Tibetan into account, this thesis proposes "a hierarchy-based approach to sentiment analysis for Tibetan text", which divides Tibetan sentiment analysis into three levels: word level, sentence level, and discourse level. Using existing resources, we propose a separate line of research for each level and design a system to implement and validate each one. The main work is as follows.

First is the word level. To address the lack of a sentiment lexicon, we manually compiled a sentiment dictionary consisting of a basic emotion dictionary, a table of degree adverbs, a vocabulary of negation and double-negation words, and a vocabulary of transition words. We then evaluated several word-embedding-based approaches to extending the sentiment lexicon and found that KNN performed best. Finally, we obtained a more practical Tibetan sentiment dictionary by automatically extending the lexicon from a corpus using KNN and word embeddings.

Second is the sentence level. To compute the sentiment tendency of Tibetan sentences, we summarized the linguistic characteristics of Tibetan sentences and abstracted a three-level rule set comprising sentence rules, clause rules, and phrase rules. We then designed a system for Tibetan sentence tendency analysis based on the sentiment lexicon and this rule set.

Third is the discourse level. Because a tagged corpus is difficult to build, we made a preliminary annotation of the original corpus using the lexicon-based method and then built an SVM model on the corpus after manual filtering. Unlike the traditional bag-of-words model, combined sentiment features were used in model training in order to reduce dimensionality and capture sentiment-related features.

Based on this research, the thesis achieves the following results. First, it proposes a three-layer framework for Tibetan sentiment analysis. Second, it brings word embedding into the construction of a Tibetan sentiment dictionary, improves on the traditional approach based on similarity computation, and obtains a better Tibetan sentiment dictionary. Third, it derives rules for the sentiment analysis of Tibetan sentences from their grammatical features, establishes sentence rules, clause rules, and phrase rules, and designs and implements a system for Tibetan sentence tendency analysis based on the sentiment lexicon. Fourth, it not only implements a lexicon-based method for Tibetan discourse sentiment analysis but also verifies the superiority, for sentiment analysis, of the SVM model based on combined sentiment features.
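
The word-level step described in the abstract, extending a seed sentiment lexicon with KNN over word embeddings, might look roughly like the sketch below. It is an illustration under stated assumptions, not the thesis's implementation: the function names (`cosine`, `knn_label`, `expand_lexicon`), the two-dimensional toy vectors, the English placeholder words standing in for Tibetan tokens, the cosine similarity measure, and the choice of k = 3 are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_label(word, embeddings, seed_lexicon, k=3):
    """Label `word` by majority vote among its k most similar seed words.

    Assumes at least one seed word has an embedding.
    """
    target = embeddings[word]
    ranked = sorted(
        (s for s in seed_lexicon if s in embeddings and s != word),
        key=lambda s: cosine(target, embeddings[s]),
        reverse=True,
    )
    votes = Counter(seed_lexicon[s] for s in ranked[:k])
    return votes.most_common(1)[0][0]


def expand_lexicon(embeddings, seed_lexicon, k=3):
    """Return the seed lexicon plus a KNN label for every other embedded word."""
    expanded = dict(seed_lexicon)
    for word in embeddings:
        if word not in seed_lexicon:
            expanded[word] = knn_label(word, embeddings, seed_lexicon, k)
    return expanded


# Toy data (assumption): English placeholders stand in for Tibetan tokens,
# and the 2-d vectors stand in for embeddings trained on a corpus.
embeddings = {
    "good": [0.90, 0.10],
    "happy": [0.80, 0.20],
    "fine": [0.85, 0.15],
    "bad": [0.10, 0.90],
    "sad": [0.20, 0.80],
    "awful": [0.15, 0.85],
    "glad": [0.82, 0.18],  # not in the seed lexicon
    "grim": [0.12, 0.88],  # not in the seed lexicon
}
seed_lexicon = {
    "good": "positive",
    "happy": "positive",
    "fine": "positive",
    "bad": "negative",
    "sad": "negative",
    "awful": "negative",
}

expanded = expand_lexicon(embeddings, seed_lexicon, k=3)
print(expanded["glad"], expanded["grim"])
```

This sketch labels every word that has an embedding. A practical extension would presumably keep only candidates whose nearest seed words exceed some similarity threshold, so that neutral vocabulary is not pulled into the dictionary.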
Keywords/Search Tags: Public opinion analysis, Tibetan sentiment analysis, Tibetan word embedding, Emotional tendency, SVM