Research on Microblog Sentiment Analysis Based on Active Learning

Posted on: 2018-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Guan
GTID: 2348330515973961
Subject: Engineering
Abstract/Summary:
As an important branch of text mining, sentiment analysis has attracted the attention of many scholars. With the rapid development of the Internet and social media, a mass of user-generated text has been created, much of it subjective and carrying clear sentiment polarity. The main constraint of the mainstream machine-learning methods for sentiment analysis is that their training sets require a large number of labeled text samples, which are expensive to annotate. Unlabeled text samples, by contrast, are easier and cheaper to obtain. How to take advantage of many unlabeled samples together with few labeled samples has therefore become a significant issue.

To address this problem, we combine an active learning strategy with a text sentiment classification method based on supervised learning. Because the text feature matrix is sparse, we take the support vector machine (SVM) as the base classifier. Margin sampling is a classic active learning method that uses SVM. However, margin sampling has problems with accuracy and performance, such as error passing and meaningless iterations. To solve these problems, we propose a new method, also based on SVM, called Active Learning in Informative Vector Selection (ALIVS).

The main work is as follows. First, we systematically study the theory of text sentiment classification and active learning, covering the major tasks and most commonly used methods of both research areas. Furthermore, we analyze the active learning method based on margin detection and identify its performance limits. Based on this research, we propose a new active learning strategy, ALIVS, which introduces the concept of an informative vector by exploiting the properties of unlabeled samples and develops a two-phase learning framework using SVM. The framework is built as a two-phase classifier: the first-phase main classifier performs sentiment classification, and the corresponding second-phase informative vector classifier uses the classification information of the main classifier to guide the selection of the most informative vectors from the unlabeled samples. These informative vectors are then labeled and added to the training set of the first-phase classifier, and the next iteration begins, continuously enhancing the capacity of the first-phase classifier to solve the problem posed at the beginning.

Finally, experiments were conducted on the COAE2014 evaluation (task 4), and ALIVS was compared with margin sampling, a state-of-the-art active learning method based on the SVM classifier. The results show clear advantages for the proposed method in terms of precision and in reducing overfitting and error cascade, demonstrating the validity of the new method. We close with prospects for the future improvement and development of the new method.
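
As an illustration only, the query step of the margin sampling baseline named in the abstract can be sketched as follows. This is a minimal sketch of the generic technique, not the thesis's implementation and not ALIVS. The function name `margin_sampling`, the parameter `k`, and the example scores are hypothetical. The sketch assumes that each unlabeled sample already has a signed SVM decision value f(x), which could be obtained, for example, from scikit-learn's `SVC.decision_function`; the abstract does not state which toolkit was used.

```python
from typing import List, Sequence


def margin_sampling(decision_values: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k samples closest to the SVM hyperplane.

    decision_values holds the signed decision value f(x) of each unlabeled
    sample. A small |f(x)| means the classifier is least certain about that
    sample, so it is queried for a label first.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank sample indices by |f(x)|; sorted() is stable, so ties keep
    # their original order.
    ranked = sorted(range(len(decision_values)),
                    key=lambda i: abs(decision_values[i]))
    return ranked[:k]


# Hypothetical decision values for five unlabeled microblog posts.
scores = [1.8, -0.1, 0.6, -2.3, 0.05]
queried = margin_sampling(scores, k=2)
print(queried)  # [4, 1]: the two posts nearest the decision boundary
```

In a full active learning loop, the queried samples would be labeled by an annotator, added to the training set, and the SVM retrained before the next query round.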
Keywords: Microblog Sentiment Analysis, Active Learning, Unlabeled Samples, Support Vector Machine