
Research On Sentiment Analysis Based On Word Vector Expansion Technology

Posted on: 2019-12-24 | Degree: Master | Type: Thesis
Country: China | Candidate: M Wang | Full Text: PDF
GTID: 2428330548474407 | Subject: Computer application technology
Abstract/Summary:
Natural language processing is an interdisciplinary field of computer science and linguistics that deals with how computers handle and analyze natural language. Sentiment analysis (SA) is the area concerned with people's opinions, emotions, assessments, and attitudes toward specific entities. A sentiment value can be expressed as a discrete category, such as positive, negative, or neutral, or as a continuous emotional intensity. Sentiment analysis mainly extracts the subjective emotional information of reviewers from datasets (Twitter, Weibo, Post Bar forums, e-commerce site reviews, etc.), which is important for analyzing social media and predicting public opinion on Internet platforms. It also helps businesses and the media understand users' preferences. With the development of deep learning, many new results have appeared in natural language processing, and in sentiment analysis in particular. This thesis analyzes and studies existing sentiment analysis techniques and models, combines the lexicon representation and the vector representation of words, and uses deep learning models to verify the idea. The work has three main parts:

1. Preprocessing the raw Twitter data. Because of the social nature of Twitter, emoticons, hashtags, word abbreviations, web addresses, and punctuation need to be handled. In addition, stop words are removed and all words are converted to lowercase.

2. Transforming the input into a two-dimensional matrix based on word embeddings so that the deep learning model can process it. Each tweet is represented as a two-dimensional array formed by stacking the word vectors of its words. Using an annotated lexicon, the representation of each word is expanded so that it is more complete across different contexts.

3. Training deep learning models on the data prepared in the previous steps. The main models tested in this thesis are convolutional neural networks (CNN), long short-term memory networks (LSTM), and bidirectional long short-term memory networks (BiLSTM); their results are then combined using ensemble learning.

The experimental results show that the method presented in the thesis clearly outperforms the compared methods on three different tasks.
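
The preprocessing and input-construction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The stop-word list, abbreviation table, emoticon tokens, function names, and the choice to expand each word vector by appending a single lexicon polarity score are all assumptions made for illustration; the abstract does not say how the lexicon information is combined with the embedding.

```python
import re
import numpy as np

# All tables below are hypothetical placeholders, not the thesis's resources.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of"}
ABBREVIATIONS = {"u": "you", "gr8": "great"}
EMOTICONS = {":)": "<smile>", ":(": "<sad>"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
PUNCT_RE = re.compile(r"[^\w\s<>]")  # keep < and > so emoticon tokens survive


def preprocess(tweet):
    """Lowercase a tweet and handle URLs, emoticons, hashtags,
    punctuation, abbreviations, and stop words."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)            # drop web addresses
    for emoticon, token in EMOTICONS.items():
        text = text.replace(emoticon, f" {token} ")
    text = HASHTAG_RE.sub(r"\1", text)      # "#happy" -> "happy"
    text = PUNCT_RE.sub(" ", text)          # drop remaining punctuation
    tokens = [ABBREVIATIONS.get(t, t) for t in text.split()]
    return [t for t in tokens if t not in STOP_WORDS]


def tweet_matrix(tokens, embeddings, lexicon, dim, max_len):
    """Stack word vectors into a (max_len, dim + 1) array.

    The extra column holds a lexicon polarity score (assumed expansion
    scheme). Unknown words and padding rows stay at zero.
    """
    matrix = np.zeros((max_len, dim + 1), dtype=np.float32)
    for i, token in enumerate(tokens[:max_len]):
        vector = embeddings.get(token)
        if vector is not None:
            matrix[i, :dim] = vector
        matrix[i, dim] = lexicon.get(token, 0.0)
    return matrix


# Toy usage with made-up embeddings and lexicon scores.
toy_embeddings = {"love": np.array([0.1, 0.2, 0.3])}
toy_lexicon = {"love": 1.0, "great": 0.8}
tokens = preprocess("I LOVE the new phone :) #Happy http://t.co/abc u r gr8!")
features = tweet_matrix(tokens, toy_embeddings, toy_lexicon, dim=3, max_len=12)
```

The resulting fixed-size array is the kind of two-dimensional input that a CNN, LSTM, or BiLSTM could consume; in practice the embeddings would come from a pretrained model and the polarity scores from an annotated sentiment lexicon.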
Keywords/Search Tags:sentiment analysis, deep learning, word embedding, CNN, ensemble