Sentiment Analysis Method Based On User-defined Dictionary

Posted on: 2020-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: B He
Full Text: PDF
GTID: 2428330596475456
Subject: Software engineering
Abstract/Summary:
With the growing number of Internet users, the volume of data available from social platforms is also growing, and Weibo is one of the most popular of these platforms. As a social medium, Weibo lets users share their feelings and thoughts on particular topics. Hot topics on Weibo are generally emerging events that quickly attract followers and online attention, which provides a unique opportunity to relate public sentiment to the events these users care about. Topic clustering, sentiment analysis and public opinion analysis have long been active research areas in natural language processing. Building on existing research, this thesis proposes new methods to mine, analyze and visualize Sina Weibo data. The work is as follows:

First, mining microblogs for known topic keywords. Existing topic clustering methods perform topic discovery and topic clustering simultaneously, without knowing the topic keywords in advance, as in the discovery and clustering of popular microblog topics. This thesis instead discovers and expands the set of related microblogs for known topic keywords. The existing topic mining method makes clever use of the "#" tag unique to Sina Weibo and applies a hierarchical clustering algorithm to cluster microblogs carrying the "#" tag effectively, but it ignores most of the messages without the "#" tag. Building on topic clustering with the "#" tag, this thesis extends mining to microblogs on the same topic that lack the "#" tag, and applies the method in a microblog topic crawler.

Second, Bi-directional LSTM sentiment classification of microblog text based on a user-defined dictionary and an attention mechanism. Existing attention-based Bi-directional LSTM text classification methods that use the word2vec text representation consider only a small local context and ignore global statistics, so the GloVe text representation is added. Neither of these two representations considers the influence of part of speech on text classification. In this thesis, a user-defined dictionary is added to a typical attention-based Bi-directional LSTM text classification method, and a three-part word vector combining word2vec, GloVe and part of speech is used as the text representation to improve the neural network structure.

Third, a real-time microblog topic message mining and sentiment analysis system. The real-time microblog topic crawler system provides crawling of microblog messages by keyword, search over the crawled messages, charts of sentiment analysis results, visualization of the domestic distribution of messages, and other functions. The system covers data acquisition, database storage, data analysis, data display, software analysis, system design, system implementation, and system testing.
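
The three-part word vector mentioned in the abstract can be illustrated with a minimal sketch. It assumes the word2vec, GloVe and part-of-speech representations are combined by per-token concatenation, which the abstract does not state. The lookup tables, dimensions and names (`W2V`, `GLOVE`, `POS_EMB`, `token_vector`, `sentence_matrix`) are hypothetical toy values, not the thesis's trained embeddings.

```python
# Toy lookup tables standing in for trained embeddings (hypothetical values).
W2V = {"good": [0.9, 0.1], "movie": [0.2, 0.7]}    # local-context vectors
GLOVE = {"good": [0.8, 0.3], "movie": [0.1, 0.6]}  # global co-occurrence vectors
POS_EMB = {"ADJ": [1.0, 0.0], "NOUN": [0.0, 1.0]}  # part-of-speech vectors

UNK = [0.0, 0.0]  # fallback for out-of-vocabulary words or unknown tags


def token_vector(word, pos_tag):
    """Concatenate word2vec, GloVe and part-of-speech vectors for one token.

    Concatenation is an assumption; the abstract only says the three
    representations are used together.
    """
    return W2V.get(word, UNK) + GLOVE.get(word, UNK) + POS_EMB.get(pos_tag, UNK)


def sentence_matrix(tagged_tokens):
    """Map [(word, pos_tag), ...] to a list of per-token input vectors."""
    return [token_vector(word, tag) for word, tag in tagged_tokens]


matrix = sentence_matrix([("good", "ADJ"), ("movie", "NOUN")])
```

In a full pipeline, a matrix like this would be the input sequence to the attention-based Bi-directional LSTM, and the user-defined dictionary would presumably influence word segmentation before the lookup; neither step is shown here.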
Keywords/Search Tags: microblog, topic clustering, sentiment classification, word vector, Bidirectional LSTM