Mixed Language Model Based Sentiment Analysis In Sina Microblog

Posted on: 2018-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
GTID: 2428330569975106
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of social media sites, microblogs have become a gathering place where people can freely express feelings and opinions. Analyzing emotions in massive volumes of microblog posts has become a meaningful research field, in which microblog sentiment classification and microblog new word detection are two typical and significant tasks. Sina Microblog posts are short texts. Traditional lexicon-based sentiment classification depends on the completeness of the sentiment dictionary; supervised and semi-supervised methods rely heavily on feature selection and combination; deep learning approaches require detailed grammatical and syntactic information, are time-consuming to train, and lack interpretability. New word detection based on a combination of rules and statistics requires precise linguistic resources to obtain candidates and fine-grained linguistic rules to filter out useless ones. To improve performance on both tasks, this thesis makes the following contributions.

First, we propose a novel Sina Microblog sentiment classification method based on a mixed language model. This approach constructs two classifiers by training a positive and a negative mixed language model, and then compares the occurrence probabilities of the same test microblog under the two models. It uses only unigram features, and training is fast. Experiments demonstrate that the proposed method is better and more stable than a traditional supervised learning method and a method that combines deep learning with supervised learning.

Second, we propose a new word detection method for Sina Microblog based on the likelihood ratio test. This approach exploits linguistic characteristics of web sentiment words, combining the likelihood ratio test with string-related statistics to mine new web sentiment words; it is unsupervised and needs only simple linguistic rules (part-of-speech only). Experiments show that our method not only finds more new web sentiment words but also ranks them higher, which in turn improves the performance of our sentiment classification.
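
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several choices are assumptions made only for illustration: "mixed" is read here as a linear interpolation of a class-specific unigram model with a background unigram model (weight `lam`), and the likelihood ratio test is taken to be Dunning's log-likelihood ratio over a 2x2 table of adjacent-token counts. The abstract does not specify either formula. All function names (`train_unigram`, `mixed_log_prob`, `classify`, `log_likelihood_ratio`, `rank_candidates`) are hypothetical, and the part-of-speech filtering and string-related statistics mentioned in the abstract are omitted.

```python
import math
from collections import Counter


def train_unigram(docs):
    """Return (token counts, total token count) for tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    return counts, sum(counts.values())


def mixed_log_prob(doc, class_model, background_model, vocab_size, lam=0.7):
    """Log-probability of doc under lam*P(w|class) + (1-lam)*P(w|background).

    Assumption: the background model is add-one smoothed so that tokens
    unseen in training still receive nonzero probability.
    """
    c_counts, c_total = class_model
    b_counts, b_total = background_model
    logp = 0.0
    for tok in doc:
        p_class = c_counts[tok] / c_total if c_total else 0.0
        p_back = (b_counts[tok] + 1) / (b_total + vocab_size)
        logp += math.log(lam * p_class + (1 - lam) * p_back)
    return logp


def classify(doc, pos_model, neg_model, background_model, vocab_size, lam=0.7):
    """Pick the polarity whose mixed model assigns the higher probability."""
    pos = mixed_log_prob(doc, pos_model, background_model, vocab_size, lam)
    neg = mixed_log_prob(doc, neg_model, background_model, vocab_size, lam)
    return "positive" if pos >= neg else "negative"


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 statistic for a 2x2 contingency table of counts."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    g2 = 0.0
    for obs, r, c in ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1)):
        if obs > 0:  # empty cells contribute nothing to the sum
            expected = rows[r] * cols[c] / n
            g2 += obs * math.log(obs / expected)
    return 2.0 * g2


def rank_candidates(docs, min_count=2):
    """Score adjacent token pairs as new-word candidates, highest G^2 first."""
    bigrams = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))
    n = sum(bigrams.values())
    firsts, seconds = Counter(), Counter()
    for (a, b), c in bigrams.items():
        firsts[a] += c
        seconds[b] += c
    scored = []
    for (a, b), k11 in bigrams.items():
        if k11 < min_count:
            continue
        k12 = firsts[a] - k11        # a followed by something other than b
        k21 = seconds[b] - k11       # b preceded by something other than a
        k22 = n - k11 - k12 - k21    # pairs involving neither position
        scored.append(((a, b), log_likelihood_ratio(k11, k12, k21, k22)))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A possible usage, with toy data: build `pos_model` and `neg_model` with `train_unigram` on labeled posts, build `background_model` on their union, set `vocab_size = len(background_model[0]) + 1` to reserve one slot for unseen tokens, and call `classify` on a tokenized post. Note that the G^2 score measures strength of association in either direction, so a real pipeline would add further filtering before treating a high-scoring pair as a new word.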
Keywords/Search Tags:Sentiment Classification, New Word Detection, Mixed Language Model, Likelihood Ratio Test, Part-Of-Speech