Sentiment Classification Based On Machine Learning

Posted on: 2018-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: G L Xin
Full Text: PDF
GTID: 2348330515466793
Subject: Computer technology
Abstract/Summary:
Owing to the strong performance of machine learning, machine-learning-based methods have become the mainstream approach to text sentiment classification. Extracting features from text and training the classifier are the two key steps of such methods, and together they directly determine the performance of the sentiment classification model. Most research focuses on extracting simple lexical features and ignores semantic features, which have proved important in many natural language processing tasks. In addition, the functions that traditional machine learning algorithms use for modeling are relatively simple and belong to shallow learning. Such algorithms are limited in their ability to express complex functions, and the resulting models generalize poorly. Moreover, many traditional machine-learning-based sentiment classification methods are evaluated only on small datasets. These methods do not consider parallel efficiency and cannot handle massive data in a real production environment, so their practical value is low. To address these problems, this thesis reviews existing research methods and carries out work in the following respects:

(1) A sentiment classification method based on semantic features and a multilayer perceptron classifier is proposed. To make full use of the semantic features in the text, the method extracts them with Word2vec. Taking the semantic features of the text as input, a deep-learning-based multilayer perceptron classifier is trained to classify the text as positive or negative. A classification algorithm based on deep learning can better describe the regularities of the samples and overcomes the limited expressive power of traditional machine learning algorithms, which improves the generalization ability of the model.

(2) A sentiment classification method based on feature fusion and model fusion is proposed. First, lexical features and semantic features are extracted from the text, capturing it from different perspectives. Then a Gradient Boosting Decision Tree classifier (GBDT) and a multilayer perceptron classifier (MLPC) are each trained to classify the text. Finally, taking the outputs of these classifiers as input, Logistic Regression (LR) is used to train the final sentiment classifier, which classifies the text. This method addresses the limited expressiveness of a single model and a single type of feature.

(3) The proposed sentiment classification methods are implemented on Spark, an in-memory parallel computing framework. A Spark-based implementation can make full use of the resources of a Spark cluster, increase the parallelism of the computing tasks, and process massive data, which increases the practical value of the methods.

The two proposed sentiment classification methods are evaluated on open-source data. The experimental results verify the validity of both methods.
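
The model fusion step in (2) can be illustrated with a small sketch. It assumes that each base classifier (standing in for GBDT and MLPC) outputs a positive-class probability on held-out texts, and that a logistic regression trained on those two outputs gives the final prediction. The probabilities, labels, function names, and hyperparameters below are invented for illustration, and the sketch is plain Python rather than the Spark implementation described in the thesis.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p(x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Probability of the positive class under the fused model."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical held-out outputs of the two base classifiers.
# Each row is [P(positive) from base model A, P(positive) from base model B].
meta_X = [
    [0.92, 0.85],
    [0.80, 0.40],
    [0.65, 0.90],
    [0.30, 0.20],
    [0.45, 0.10],
    [0.15, 0.35],
]
meta_y = [1, 1, 1, 0, 0, 0]  # true sentiment labels (1 = positive)

# The meta-classifier learns how much to trust each base model.
meta_model = train_logistic_regression(meta_X, meta_y)

fused_positive = predict_proba(meta_model, [0.88, 0.79])
fused_negative = predict_proba(meta_model, [0.12, 0.25])
```

Training the meta-classifier on held-out base-model outputs, rather than on the data the base models were fitted on, is a common precaution against overfitting in stacking; the abstract does not say how the thesis splits its data.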
Keywords/Search Tags:Sentiment Classification, Semantic Features, Deep Learning, Ensemble Learning