Research On Multi-feature Sentiment Analysis Method Based On Microblog

Posted on: 2022-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: S H Chen
GTID: 2518306329473174
Subject: Public health and preventive medicine
Abstract/Summary:
Objective: This thesis takes microblog-based sentiment analysis as its research direction and proposes a multi-feature sentiment analysis method that combines a sentiment lexicon with machine learning, in order to present user sentiment more objectively and realistically and to further improve the accuracy of microblog sentiment analysis.

Method: Based on a review of domestic and international research on feature selection and sentiment analysis methods, three feature indicators (topic features, behavior features, and text features) are selected according to the characteristics of the microblog platform and of microblog text content. The LDA topic recognition model and the ROST content mining system are used to calculate the sentiment values of the topic features of the text data. The sentiment values of the behavior features are calculated from microblog influence and the number of likes. The relevant sentiment lexicon is reconstructed and expanded into a knowledge base of sentiment constants, which is used to calculate the sentiment values of the text features. The feature set matrix is then constructed from these values. Finally, machine learning models are trained on the feature set matrix, and the idea of ensemble learning is introduced to integrate their classification results through a multi-voting algorithm. The text of microblog posts about an unexpected event is selected as the research object for sentiment polarity classification. The effectiveness of the proposed method is evaluated by accuracy, precision, recall, and F1 value.

Results: (1) In the data acquisition section, based on the selected target event, text data about the event is obtained from the microblogging platform; a total of 24904 microblog texts are crawled. (2) In the model building section, based on the selected multi-feature indicators, the sentiment values of the topic feature indicators, the sentiment values of the microblog influence and like behavior feature indicators, and the sentiment values of the document-level, sentence-level, and word-level text feature indicators are calculated separately, and the feature set matrix is constructed from them. (3) In the result comparison section, the five machine learning classification methods selected in this thesis (SVM, KNN, BP neural network, random forest, and XGBoost) are compared. The XGBoost model performs best, with an accuracy of 84.52%, a precision of 82.57%, a recall of 76.95%, and an F1 value of 79.66%. The proposed multi-feature method is also compared with methods that use a single feature type (topic, behavior, or text) and with methods that use two feature types (topic+text and behavior+text). With an accuracy of 84.52%, a precision of 82.57%, a recall of 76.95%, and an F1 value of 79.66%, the multi-feature method outperforms all of these variants in sentiment polarity classification. After the results are integrated by the multi-voting algorithm, the multi-feature method achieves the highest values on every index: an accuracy of 94.88%, a precision of 96.35%, a recall of 94.50%, and an F1 value of 95.42%, the best results in classifying the sentiment polarity of the text data. Even without integration, the multi-feature method shows a clear advantage over the baseline chosen in this thesis, a naive Bayes sentiment analysis model. These results support the effectiveness of the proposed multi-feature sentiment analysis method.

Conclusion: (1) According to the characteristics of the microblogging platform, and drawing on the theories of emotional contagion and network cluster behavior, multi-feature indicators covering topic features, behavior features, and text features are selected, and a multi-feature sentiment analysis method for microblog text is proposed, so that it is better adapted to sentiment analysis on online social media platforms such as microblogs. Unexpected events on microblogs are selected for the empirical study. Compared with the baseline and with the sentiment analysis methods that use different feature combinations, the proposed method classifies sentiment polarity more effectively, which demonstrates its effectiveness and superiority. (2) The method combines a sentiment dictionary with machine learning. Unlike the traditional combination, this thesis integrates the two by reconstructing and expanding the sentiment dictionary into a knowledge base of sentiment constants, which serves as the basis for text feature index screening and sentiment value calculation. This provides a reference for subsequent research on sentiment analysis methods.
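The voting and evaluation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the multi-voting algorithm resolves votes, so plurality voting over hard labels is assumed here. The function names (`majority_vote`, `binary_metrics`, `f1_from`) and the toy labels are hypothetical and are not taken from the thesis. The last lines only check that the reported F1 values are consistent with the reported precision and recall.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample (plurality vote).

    Assumption: each model emits one hard label per sample; ties go to the
    label that appears first among the votes for that sample.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from(precision, recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = positive, 0 = negative), not data from the thesis.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble, binary_metrics(y_true, ensemble))

# Consistency check on the figures reported in the abstract.
print(round(f1_from(0.8257, 0.7695) * 100, 2))  # single XGBoost model
print(round(f1_from(0.9635, 0.9450) * 100, 2))  # after vote integration
```

In this toy case each model makes a different mistake, so the vote recovers every true label. That illustrates why integrating several classifiers can outperform any single one, although it does not by itself explain the size of the gain reported in the thesis.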
Keywords/Search Tags:Sentiment analysis, Multi-features, Machine learning, Microblogging