
Research On Web Text Sentiment Analysis Method

Posted on: 2021-05-02 | Degree: Master | Type: Thesis
Country: China | Candidate: T Wang
GTID: 2428330611497422 | Subject: Computer technology
Abstract/Summary:
Sentiment analysis, also called opinion mining, aims to identify the sentiment polarity of subjective text, which often carries emotional coloring. Web text sentiment analysis is one of the main tasks of natural language processing and includes corpus processing, emotion lexicon construction, and sentiment polarity analysis. There are two common approaches. The first is based on an emotion lexicon; it achieves high accuracy but low recall and has certain limitations. The second is based on machine learning; it requires little linguistic knowledge but needs a large amount of annotated data, which makes it difficult to apply.

A study of current sentiment analysis techniques for Chinese web text shows that existing classification methods still have the following shortcomings. First, a general-purpose emotion lexicon does not transfer to a specific domain, because each domain has its own specialized vocabulary that the general lexicon cannot recognize. For example, there is still no complete official emotion lexicon for online film reviews, one type of web text, so lexicon-based sentiment analysis performs poorly on them and cannot accurately and fully extract the sentiment information in the text. Second, Word2Vec is an efficient tool for representing words as real-valued vectors, but the Word2Vec model cannot distinguish the importance of words in a text. Chinese text always contains center words, which influence the sentiment polarity of the text more than other words do. Existing word vector representations do not reflect these differences in word importance, which is inconsistent with human cognition.

To address these shortcomings, this thesis designs a sentiment analysis method that combines a domain emotion lexicon with machine learning. The main work is as follows:
(1) The SO-PMI algorithm is improved by introducing the distance between co-occurring words into the calculation of pointwise mutual information (PMI), yielding the SO-LPMI algorithm.
(2) The concept of a semantically weighted word vector is proposed. A semantic factor is introduced into the calculation of word feature weights to improve TF-IDF, yielding the LOCTF-IDF weighting method, which highlights the semantic information of the text.
(3) A film-domain emotion lexicon is built by expanding the lexicon with the SO-LPMI algorithm. The general lexicon and the film-domain emotion lexicon constructed in this thesis are used to classify the same corpus, and the classification results are compared. The LOCTF-IDF weighting method is then used to weight Word2Vec word vectors so that emotion words in important positions in the text receive higher weight, and a support vector machine classification experiment demonstrates the effectiveness of the method.
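
As an illustration of item (1): standard SO-PMI scores a candidate word by its pointwise mutual information with positive seed words minus its PMI with negative seed words. The abstract does not give the SO-LPMI formula, so the Python sketch below is only a rough approximation under stated assumptions. The 1/d distance weighting, the smoothing constant, the function names, and the toy English corpus are all hypothetical and are not taken from the thesis. Real Chinese text would first need word segmentation.

```python
import math
from collections import Counter


def cooccurrence_counts(docs, window=5, distance_weighted=False):
    """Count words and unordered word pairs co-occurring within `window` tokens.

    With distance_weighted=True a pair at token distance d contributes 1/d
    instead of 1. This weighting is an assumption made for illustration.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in docs:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                right = tokens[j]
                if left == right:
                    continue
                weight = 1.0 / (j - i) if distance_weighted else 1.0
                pair_counts[frozenset((left, right))] += weight
    return word_counts, pair_counts


def pmi(a, b, word_counts, pair_counts, smoothing=0.01):
    """PMI estimated from counts; `smoothing` avoids log(0) for unseen pairs."""
    if word_counts[a] == 0 or word_counts[b] == 0:
        return 0.0
    total = sum(word_counts.values())
    joint = pair_counts[frozenset((a, b))] + smoothing
    return math.log2(joint * total / (word_counts[a] * word_counts[b]))


def so_pmi(word, pos_seeds, neg_seeds, word_counts, pair_counts):
    """Semantic orientation: PMI with positive seeds minus PMI with negative seeds."""
    pos = sum(pmi(word, s, word_counts, pair_counts) for s in pos_seeds)
    neg = sum(pmi(word, s, word_counts, pair_counts) for s in neg_seeds)
    return pos - neg


# Toy, already-tokenized "film reviews" (hypothetical data).
docs = [
    ["plot", "great", "moving"],
    ["acting", "great", "moving"],
    ["plot", "boring", "awful"],
    ["ending", "boring", "awful"],
    ["moving", "great"],
    ["awful", "boring"],
]
wc, pc = cooccurrence_counts(docs, window=3, distance_weighted=True)
for candidate in ("moving", "awful"):
    score = so_pmi(candidate, ["great"], ["boring"], wc, pc)
    label = "positive" if score > 0 else "negative"
    print(candidate, label)
```

In a lexicon-expansion setting, candidates whose score exceeds a chosen threshold in either direction would be added to the domain lexicon with the corresponding polarity. The threshold and the seed words are design choices that this sketch leaves open.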
Keywords/Search Tags: web text, emotion lexicon, support vector machine, semantic weighted word vector