
The Research Of Sentiment Analysis Technology For Chinese Microblog

Posted on: 2020-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: H Tian
Full Text: PDF
GTID: 2428330578955874
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, many successful applications such as microblogs have emerged. Microblogging, with its openness, originality, and convenience, attracted a large user base in a very short time and became one of the most popular places for users to voice their opinions. With the advent of the big-data era, the potential value of massive microblog data has gradually been recognized, and understanding users' emotions about current topics has become a research hot spot. Microblog sentiment analysis mainly determines the sentiment polarity of microblog posts. The main approaches are based either on machine learning or on a sentiment dictionary, and both are used in this thesis. The thesis studies two aspects in depth: expanding the sentiment dictionary and improving the sentiment classification method. The main contributions are as follows:

(1) The microblog data are preprocessed by defining data-denoising rules, extending the user-defined vocabulary of the word segmentation tool, and removing stop words. A basic sentiment dictionary is then constructed by integrating the HowNet Dictionary, the National Taiwan University Sentiment Dictionary, and the Chinese Emotional Vocabulary Ontology Library of Dalian University of Technology through a voting mechanism and a priority mechanism.

(2) For dictionary expansion, the thesis brings case-based reasoning, an idea commonly used in artificial intelligence, into word-vector-based sentiment dictionary expansion and proposes the C-word2vec model. With the original word2vec model, the basic sentiment dictionary stays unchanged throughout the process of identifying new sentiment words, whereas the C-word2vec model adds each newly recognized sentiment word to the basic sentiment dictionary, which improves the recall of sentiment word recognition.

(3) For rule formulation, the thesis considers the influence of negation words, degree adverbs, and expressions on microblog sentiment classification and formulates corresponding sentiment scoring rules. The experimental results show that the method using the rules is more accurate than the method without them.

(4) Within the machine-learning framework for sentiment classification, the sentiment dictionary and the rules are introduced, and a comprehensive sentiment classification method is proposed. With this method, feature extraction is no longer limited to the labeled data, and the extracted features retain more inter-word semantics. The comprehensive method was validated on the microblog dataset, which demonstrated its validity and feasibility.
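
The contrast in item (2), between a seed dictionary that stays fixed and one that absorbs each newly recognized word, can be illustrated with a small sketch. This is not the thesis's C-word2vec implementation. The cosine-similarity threshold, the function name `expand`, and the two-dimensional toy vectors are assumptions made only for illustration; real word vectors would come from a trained word2vec model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand(seed, vectors, threshold=0.9, grow=True):
    """Return the seed words plus candidates accepted as sentiment words.

    grow=False: candidates are compared with the original seed only
                (the fixed-dictionary behaviour described in the abstract).
    grow=True:  accepted words join the reference set and passes repeat
                until nothing new is accepted (the feedback idea).
    The acceptance rule (cosine >= threshold) is an assumption.
    """
    lexicon = set(seed)
    reference = set(seed)
    changed = True
    while changed:
        changed = False
        for word in sorted(vectors):  # sorted for a deterministic order
            if word in lexicon:
                continue
            if any(cosine(vectors[word], vectors[ref]) >= threshold
                   for ref in reference if ref in vectors):
                lexicon.add(word)
                changed = True
                if grow:
                    reference.add(word)
        if not grow:
            break  # a fixed reference set needs only one pass
    return lexicon

# Hypothetical toy vectors; English words stand in for Chinese sentiment words.
TOY_VECTORS = {
    "good": (1.0, 0.0),
    "great": (0.9, 0.3),    # close to "good"
    "awesome": (0.7, 0.6),  # close to "great", but not close enough to "good"
    "table": (0.0, 1.0),    # unrelated word
}

print(sorted(expand({"good"}, TOY_VECTORS, grow=False)))  # ['good', 'great']
print(sorted(expand({"good"}, TOY_VECTORS, grow=True)))   # ['awesome', 'good', 'great']
```

In this toy setting the growing dictionary picks up "awesome" through "great", which the fixed dictionary misses. That is one way to see why feeding recognized words back into the dictionary can raise recall, although a looser reference set may also admit more false positives.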
Keywords/Search Tags: Microblog, Sentiment Analysis, Sentiment Dictionary, C-word2vec, SVM