
Research On Short Text Emotion Classification Method Based On Word2Vec And N-Gram

Posted on: 2019-01-30  Degree: Master  Type: Thesis
Country: China  Candidate: Y C Du  Full Text: PDF
GTID: 2428330596964831  Subject: Software engineering
Abstract/Summary:
With the spread and rapid development of social networks, microblogs, online reviews, and forums have become the main platforms on which people are active online. People engage socially on the Internet around the clock, producing massive amounts of short text. Governments emphasize the monitoring of public sentiment, companies pay increasing attention to feedback on their products, and schools pay increasing attention to students' daily lives and psychological well-being. All of this information can be obtained from social networks, so analyzing the sentiment tendencies in this mass of data can provide better decision support for governments, enterprises, and schools. To address these problems, this thesis proposes a short text classification model based on the Word2Vec algorithm. The main work is summarized as follows:
(1) This thesis studies in depth the feature extraction methods used in traditional text classification research (information gain, TF-IDF, and mutual information) to construct a vector space model. Applied to a dataset of 34,000 microblog posts, these methods yield low sentiment classification accuracy. Because short texts contain so few words, it is difficult to extract good features from them with general feature extraction methods. This thesis therefore uses a word vector model based on distributed semantic representation to construct the short text feature vector space.
(2) This thesis studies word vector generation algorithms based on distributed semantic representation, uses the Word2Vec algorithm to train a vector library containing 440,000 word vectors, and then proposes a word sequence feature extraction model based on the N-Gram algorithm. The model looks up word vectors in the vector library, calculates the feature vector of each word combination in the short text, and finally extracts feature vector values based on the word sequence.
(3) Several sentiment lexicons are combined and filtered and, based on the word vectors, an emotional word vector library containing about 25,000 words is constructed. The thesis also examines the expansion of the sentiment dictionary and the performance of the expanded dictionary.
(4) An experimental dataset is constructed by collecting Weibo content from different users between October and December 2015, and the standard dataset released by NLPCC (Natural Language Processing and Chinese Computing Conference) in 2014 is used as a second experimental dataset for side-by-side comparison. The thesis compares the dictionary-based classification method with the proposed Word2Vec-based short text classification method. The experimental results demonstrate the effectiveness of the proposed method.
(5) The experimental results show that the proposed method handles the analysis of short text sentiment tendency well. However, this thesis does not study the emotional intensity expressed in short texts, which can better explain the emotions a short text conveys. Future work therefore needs to study this and improve the proposed method.
(6) Based on the proposed short text sentiment classification method, applications were developed in two fields.
1. Ecological resource allocation decision-making experiment platform: In studies of the ecological behavior of resource allocation, exchanges among resource distributors affect, to a certain extent, their allocation decisions and the final allocation results. The proposed algorithm is therefore integrated with the communication module of the resource allocation experiment platform; by analyzing the sentiment of the distributors' conversations, it gives them decision support and makes the final allocation result more reasonable.
2. Hangzhou Emotion Map: With the rapid development of social networks, public security departments attach more and more importance to monitoring public opinion. The app therefore uses web crawlers to collect posts from Weibo, Baidu Post Bar, and Hangzhou BBS forums, and analyzes the sentiment, location, and time information of the published content. It then displays the real-time emotional changes of residents in the Hangzhou region on a map.
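The word sequence feature extraction described in item (2) might look roughly like the following sketch. The abstract does not specify how the vectors of a word combination are merged, so averaging the vectors inside each N-Gram window and then averaging across windows is an assumption made here for illustration. The table `TOY_VECTORS`, its values, and the helper names `mean_vector`, `ngram_vectors`, and `text_feature` are all hypothetical; in the thesis the vectors would come from the trained Word2Vec library.

```python
from typing import Dict, List, Sequence

Vector = List[float]

# Hypothetical 3-dimensional embeddings standing in for a trained Word2Vec library.
TOY_VECTORS: Dict[str, Vector] = {
    "service": [0.2, 0.1, 0.0],
    "really": [0.0, 0.3, 0.1],
    "great": [0.9, 0.8, 0.1],
    "awful": [-0.9, -0.7, 0.2],
}


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of a non-empty sequence of equally sized vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def ngram_vectors(tokens: Sequence[str], table: Dict[str, Vector], n: int = 2) -> List[Vector]:
    """Return one vector per N-Gram window over the in-vocabulary tokens.

    Assumption: a window's vector is the mean of its word vectors.
    Out-of-vocabulary tokens are skipped before windows are formed.
    """
    known = [table[token] for token in tokens if token in table]
    if not known:
        return []
    if len(known) < n:
        # Text shorter than one window: treat all known words as a single combination.
        return [mean_vector(known)]
    return [mean_vector(known[i:i + n]) for i in range(len(known) - n + 1)]


def text_feature(tokens: Sequence[str], table: Dict[str, Vector], n: int = 2) -> Vector:
    """Pool the N-Gram vectors into one fixed-length feature vector for the text."""
    dim = len(next(iter(table.values())))
    grams = ngram_vectors(tokens, table, n)
    if not grams:
        # No known words: fall back to a zero vector of the embedding dimension.
        return [0.0] * dim
    return mean_vector(grams)


if __name__ == "__main__":
    print(text_feature(["service", "really", "great"], TOY_VECTORS, n=2))
```

The pooled vector has the same length as the word vectors regardless of text length, so it can be passed to any standard classifier. Other ways of combining the windows, such as concatenation or max pooling, would fit the same outline.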
Keywords/Search Tags: short text, word vector, N-Gram, emotional dictionary, sentiment classification