Research On Sentiment Classification Method Based On Semi-supervised Learning

Posted on:2020-01-17

Degree:Master

Type:Thesis

Country:China

Candidate:W T Liu

GTID:2428330575456413

Subject:Information and Communication Engineering

Abstract/Summary:

With the rapid development of Internet technology, more and more users post comments on the web. Automatically mining the sentiment orientation of these subjective texts has considerable practical and economic value for individuals, enterprises, and governments, and text sentiment classification is an effective tool for this task. As a general-purpose machine learning technique, semi-supervised learning can exploit unlabeled samples to improve classification performance. Because many text sentiment classification scenarios suffer from an insufficient labeled corpus, and labeling samples is time-consuming and laborious, this thesis focuses on semi-supervised sentiment classification. The main contributions are as follows.

First, this thesis proposes a co-training sentiment classification algorithm based on stratified-sampling random subspaces. When the semi-supervised learning algorithm based on random feature subspaces is applied directly to text sentiment classification, some subspaces may contain no strongly relevant attributes. The proposed algorithm addresses this by constructing the subspaces with stratified sampling, which improves the sufficiency of each subspace while preserving the diversity among subspaces. Experiments show that it classifies better than the semi-supervised learning algorithm based on random feature subspaces and other commonly used semi-supervised learning algorithms.

Second, this thesis proposes a semi-supervised sentiment classification algorithm based on diversity and high-confidence estimation. During iterative training, incremental self-training easily introduces mislabeled samples. The proposed algorithm combines the posterior probability and the prior distribution information of each sample to mitigate this problem. To keep the selected samples from concentrating in one region, which would make the training data distribution inconsistent with the real distribution, the algorithm uses diversity metrics to ensure that the selected samples differ from one another. Experiments show that the proposed algorithm achieves better classification performance than some commonly used incremental semi-supervised learning algorithms.
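As a rough illustration of the stratified-sampling subspace idea in the abstract, the sketch below ranks features by a relevance score, splits the ranking into strata, and draws a quota from each stratum for every subspace. The abstract does not state the relevance measure, the number of strata, or the quotas used in the thesis, so the function name `stratified_subspaces`, its parameters, and the near-equal quota rule are all assumptions made for illustration.

```python
import random


def stratified_subspaces(scores, n_subspaces, subspace_size, n_strata=2, seed=0):
    """Build feature subspaces by sampling from relevance strata.

    scores: dict mapping feature name -> relevance score (higher = more relevant).
            How the scores are computed is an assumption left to the caller.
    Returns a list of feature-name lists, one per subspace.
    """
    rng = random.Random(seed)
    # Rank features from most to least relevant.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Cut the ranking into n_strata contiguous strata of near-equal size.
    bounds = [i * len(ranked) // n_strata for i in range(n_strata + 1)]
    strata = [ranked[bounds[i]:bounds[i + 1]] for i in range(n_strata)]
    # Assumed quota rule: split subspace_size evenly across strata,
    # giving any remainder to the most relevant strata first.
    base, extra = divmod(subspace_size, n_strata)
    subspaces = []
    for _ in range(n_subspaces):
        chosen = []
        for i, stratum in enumerate(strata):
            quota = base + (1 if i < extra else 0)
            # Sample without replacement inside a stratum, so every subspace
            # is guaranteed some features from the high-relevance stratum.
            chosen.extend(rng.sample(stratum, min(quota, len(stratum))))
        subspaces.append(chosen)
    return subspaces


# Toy relevance scores (made-up values, for illustration only).
toy_scores = {"good": 0.9, "bad": 0.8, "great": 0.7, "awful": 0.6,
              "the": 0.1, "and": 0.05, "of": 0.04, "it": 0.02}
for subspace in stratified_subspaces(toy_scores, n_subspaces=3, subspace_size=4):
    print(subspace)
```

In a co-training setup, each returned subspace would define the feature view for one base classifier. Pure random sampling over all eight toy features could produce a subspace made up only of low-scoring words, which is the insufficiency problem the abstract describes; the per-stratum quota rules that case out.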
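The sample-selection step of the second algorithm can be sketched in a similar spirit. The version below keeps only pseudo-labeled candidates whose top posterior probability clears a threshold, then greedily skips any candidate that is too similar to one already selected. This is a simplified stand-in, not the thesis's method: the abstract does not specify its confidence estimate (which also uses prior distribution information, not modeled here) or its diversity metric, so the cosine-similarity check, the thresholds, and the name `select_pseudo_labeled` are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_pseudo_labeled(candidates, min_conf=0.9, max_sim=0.95, k=2):
    """Pick up to k confident, mutually dissimilar unlabeled samples.

    candidates: list of (feature_vector, posterior) pairs, where posterior
                is a dict mapping label -> predicted probability.
    Returns a list of (candidate_index, pseudo_label) pairs.
    """
    scored = []
    for idx, (_, posterior) in enumerate(candidates):
        label = max(posterior, key=posterior.get)
        # Confidence filter (assumed form): top posterior must reach min_conf.
        if posterior[label] >= min_conf:
            scored.append((posterior[label], idx, label))
    # Most confident first; ties broken by original order.
    scored.sort(key=lambda item: (-item[0], item[1]))

    selected = []
    for _, idx, label in scored:
        vec = candidates[idx][0]
        # Diversity filter (assumed form): reject near-duplicates of
        # samples that were already selected in this round.
        if all(cosine(vec, candidates[j][0]) <= max_sim for j, _ in selected):
            selected.append((idx, label))
        if len(selected) == k:
            break
    return selected


# Toy candidates (made-up vectors and posteriors, for illustration only).
toy = [
    ([1.0, 0.0], {"pos": 0.97, "neg": 0.03}),
    ([0.99, 0.01], {"pos": 0.96, "neg": 0.04}),  # near-duplicate of the first
    ([0.0, 1.0], {"pos": 0.05, "neg": 0.95}),
    ([0.5, 0.5], {"pos": 0.60, "neg": 0.40}),    # below the confidence threshold
]
print(select_pseudo_labeled(toy))
```

Selecting by confidence alone would take the first two toy candidates, which are nearly identical and carry the same label; the diversity check trades the second one for a confident sample from a different region of the feature space.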

Keywords/Search Tags:

sentiment classification, semi-supervised learning, stratified sampling, high confidence, diversity metrics

Related items

1	Text Emotion Analysis Technology Based On Semi-Supervised Machine Learning
2	Sentiment Classification Research Based On Semi-supervised Learning
3	Research On Several Algorithms And Theories In Diversity-Based Semi-Supervised Learning
4	Based On The Positive And Unlabeled Samples, Semi-supervised Classification
5	The Design And Prototype Implementation Of Sentiment Analysis System Based On Semi-supervised Learning
6	Research On Semi-supervised Classification Algorithm Based On Temporal Relationship Learning
7	Research On Sentiment Classification Based On Co-training In Semi-supervised Learning
8	Semi-supervised Sentiment Classification Based On Ensemble Learning With Voting Combination
9	Semi-supervised Learning And Active Learning Of Sentiment Classification Coupled With Domain Knowledge
10	Sentiment Classification With Semi-supervised Ensemble Learning