Research For Sentiment Analysis Based On Active Learning

Posted on: 2020-06-02; Degree: Master; Type: Thesis
Country: China; Candidate: X R Wang; Full Text: PDF
GTID: 2428330599453560; Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, a large number of user comment texts expressing opinions and emotions have emerged online. Relying on manual methods to process and analyze the emotional information in these texts is time-consuming and laborious, so there is an urgent need for technologies that can process and analyze comment texts quickly and automatically. Text sentiment analysis emerged to meet this need and has developed rapidly; it is widely used in business decision making, viewpoint search, information prediction, and sentiment management. This thesis mainly studies the Chinese sentiment classification subtask of sentiment analysis, which aims to automatically judge the sentiment polarity of a text. Machine learning is currently one of the mainstream approaches to sentiment classification, but it requires a large labeled corpus to train the classification model, and manually labeling large amounts of data is costly and error-prone. It is therefore of great research value to maintain the performance of the classification model while reducing the size of the labeled corpus. Moreover, coarse-grained sentiment classification cannot capture the different topics contained in comment texts or their corresponding emotional tendencies. In view of these problems, the main research work of this thesis includes:

(1) To address the cost and error-proneness of obtaining a large labeled corpus, this thesis introduces active learning into machine-learning-based sentiment classification and, using the query-by-committee approach to active learning, proposes the Sentiment Analysis method based on Query by Committee (SAQBC). A sample selection strategy picks unlabeled samples that are highly informative for classification, these samples are labeled, and the machine learning model is trained iteratively on them, which reduces the sample labeling workload.

(2) To address the inability of coarse-grained sentiment classification to obtain the different topics contained in comment texts and their corresponding emotional tendencies, this thesis proposes a method of Sentiment Analysis based on Topic Model and Active Learning (SATMAL), built on the SAQBC method and the LDA model. First, the topic information hidden in the comment texts is obtained through the LDA model; then the emotional polarity is predicted by SAQBC; finally, the different topics and their corresponding emotional tendencies are obtained.

(3) A hotel reviews dataset is presented in this thesis to verify the effectiveness of the SAQBC and SATMAL methods, and SAQBC is compared with other commonly used machine-learning-based sentiment classification models. The experimental results show that SAQBC still performs best when the size of the data set is reduced by more than half, with an accuracy 1.45% higher than that of the best comparison method. The experiments also show that, in practical application, the SATMAL method can uncover the latent topic information of comment texts and the corresponding emotional tendencies.
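
The sample selection step in query by committee can be illustrated with a minimal sketch. The abstract does not say how SAQBC measures committee disagreement, so the vote-entropy score below is an assumption (it is one common choice), and the function names, the three-member committee, and the example votes are all hypothetical. Each committee member predicts a label for every unlabeled sample, and the samples whose votes are most evenly split are the ones sent for manual labeling.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Disagreement among committee votes for one sample (0.0 = unanimous)."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_queries(committee_votes, batch_size):
    """Return indices of the unlabeled samples the committee disagrees on most.

    committee_votes[i] is the list of labels predicted for sample i,
    one label per committee member.
    """
    scored = [(vote_entropy(v), i) for i, v in enumerate(committee_votes)]
    # Highest entropy first; ties are broken by the lower sample index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:batch_size]]


# Hypothetical votes from a three-member committee on four unlabeled reviews.
votes = [
    ["pos", "pos", "pos"],  # unanimous
    ["pos", "neg", "pos"],  # split 2-1
    ["neg", "neg", "neg"],  # unanimous
    ["neg", "pos", "neg"],  # split 2-1
]
queries = select_queries(votes, batch_size=2)
```

In an active learning loop of this kind, the selected samples would be labeled by an annotator, added to the training set, and the committee retrained before the next round of selection.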
Keywords/Search Tags: Sentiment Analysis, Machine Learning, Active Learning, Query by Committee, Topic Model