Research On Sentiment Analysis Of Microblog Text Based On Recognition Of Sentiment New Words

Posted on: 2022-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: W T Liu
GTID: 2518306341955599
Subject: Computer Science and Technology
Abstract/Summary:
In topic discussions on microblogs, users favor simplified input, casual expression, and sentiment output, so they spontaneously change the way they express themselves. This change directly increases the number of new internet words, which makes it more difficult to analyze the sentiment tendency of microblog texts. To solve this problem, this paper proposes a method for sentiment analysis of microblog text based on sentiment new word recognition. First, new words are identified from microblog data by a statistical method; second, sentiment new words are identified based on the context information and semantic information of the new words; finally, the sentiment tendency of a microblog text is judged by combining the expanded microblog sentiment lexicon with multiple rules. The main research contents are as follows:

(1) Based on the characteristics of new words, a new word recognition algorithm based on improved mutual information is proposed. First, to handle the case where a word is mistakenly split into multiple words by the word segmentation tool, N-gram segmentation is performed on the preprocessed microblog text; then, candidate new words are identified by combining the improved pointwise mutual information with the left and right adjacency entropy; finally, the candidate new words are filtered against sets of repeated words and expanded words to obtain the new word set.

(2) To address the question of whether a new word has co-occurring sentiment words, a sentiment new word recognition algorithm combining keywords between words and cosine similarity is proposed. First, a microblog sentiment lexicon is constructed to determine whether a new word has co-occurring sentiment words; then, an improved Semantic Orientation Pointwise Mutual Information (SO-PMI) method and an improved cosine similarity method are used to calculate the sentiment polarity of the new word; finally, sentiment new words are recognized according to a threshold on the sentiment tendency of words.

(3) To address the limited number of words contained in the existing basic sentiment dictionary, a microblog text sentiment analysis optimization algorithm that combines the expanded lexicon and multiple rules is proposed. First, sentiment words are identified in the microblog text based on the expanded lexicon; then, the text in which the sentiment words are located is matched against a modifier lexicon and multiple rules; finally, the positional characteristics of each sentence are considered and the sentiment polarity of the entire microblog text is calculated, from which the sentiment tendency of the microblog is determined.

The experimental results show that the microblog text sentiment analysis method based on sentiment new word recognition not only identifies new words and sentiment new words effectively, but also improves the accuracy of sentiment tendency analysis of microblog text. This paper considers the multiple composition patterns of new words when identifying them, and improves the traditional method of sentiment new word recognition based on the characteristics of microblog text and the deficiencies in the existing process of recognizing the sentiment of new words. This makes it possible to identify sentiment new words accurately in a massive and complex microblog corpus. By adding sentiment new words to the microblog sentiment lexicon, this research makes the judgment of the sentiment tendency of users' microblogs more accurate, which helps the relevant departments guide public opinion correctly, helps enterprises formulate corresponding business strategies, and helps consumers buy products that better match their own wishes.

Figure [16] Table [24] Reference [70] ...
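The candidate-scoring step in (1) can be illustrated with a minimal sketch. It uses the standard pointwise mutual information, PMI(a, b) = log2(P(a, b) / (P(a) P(b))), and the Shannon entropy of the tokens adjacent to a candidate on each side; it does not reproduce the improved PMI proposed in the thesis, whose formula is not given in this abstract. The function name `score_candidates`, the restriction to two-token candidates, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def score_candidates(sentences):
    """Score adjacent token pairs by standard PMI and left/right adjacency entropy.

    `sentences` is a list of token lists (already segmented). This is an
    illustrative sketch, not the thesis's improved-PMI algorithm.
    """
    unigrams = Counter()
    bigrams = Counter()
    left_neighbors = defaultdict(Counter)   # tokens seen just before a pair
    right_neighbors = defaultdict(Counter)  # tokens seen just after a pair

    for tokens in sentences:
        unigrams.update(tokens)
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            bigrams[pair] += 1
            if i > 0:
                left_neighbors[pair][tokens[i - 1]] += 1
            if i + 2 < len(tokens):
                right_neighbors[pair][tokens[i + 2]] += 1

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    def entropy(counter):
        # Shannon entropy (base 2) of the neighbor distribution.
        n = sum(counter.values())
        if n == 0:
            return 0.0
        return -sum((c / n) * math.log2(c / n) for c in counter.values())

    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / total_bigrams
        p_a = unigrams[a] / total_unigrams
        p_b = unigrams[b] / total_unigrams
        pmi = math.log2(p_ab / (p_a * p_b))
        scores[(a, b)] = (
            pmi,
            entropy(left_neighbors[(a, b)]),
            entropy(right_neighbors[(a, b)]),
        )
    return scores


if __name__ == "__main__":
    # Hypothetical toy corpus: the pair ("x", "y") recurs in varied contexts.
    corpus = [["a", "x", "y", "b"], ["c", "x", "y", "d"]]
    for pair, (pmi, left_h, right_h) in score_candidates(corpus).items():
        print(pair, round(pmi, 3), round(left_h, 3), round(right_h, 3))
```

A pair whose parts co-occur more often than chance (high PMI) and which appears in varied left and right contexts (high entropy on both sides) is a plausible candidate new word. Any thresholds on these scores would have to be tuned on real data and are not specified here.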
Keywords/Search Tags: Microblog, sentiment new words, pointwise mutual information, cosine similarity, sentiment lexicon, multiple rules