
Research On Text Sentiment Classification Based On Machine Learning

Posted on: 2017-10-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Q Zhang
GTID: 1318330566955713
Subject: Management Science and Engineering
Abstract/Summary:
The widespread use and development of the Internet has generated subjective text resources, such as product reviews, news comments and stock reviews, on a massive scale. These subjective texts contain a great deal of information, and how to process and use them is one of the open problems in information management. Sentiment classification offers a way to address it: it automatically sorts large volumes of subjective text into categories according to the sentiment polarity they express. Text sentiment classification can reveal user preferences in business intelligence, track public opinion in e-government, and help anticipate stock market movements in information prediction. Although research on sentiment classification for English has advanced considerably, comparatively little work exists for Chinese. Against this background, this dissertation focuses on Chinese sentiment classification using machine learning methods. Building on an analysis of Chinese grammar and a survey of machine learning methods, it addresses three aspects: Chinese text representation, the high dimensionality of text representations, and the classification model. The main research and innovative work are summarized as follows:

(1) A triple dependency feature based on a dependency parser for Chinese text was built. Transforming text into a structured data format is the first step. Common text representations lack the modification relations between words. By exploiting the dependency relations between words, the dependency tree of a sentence is transformed into a structural feature vector. Based on a study of Chinese dependency syntax and Chinese grammar, the original dependency tree was pruned by merging nodes and deleting nodes, and algorithms for both operations are provided. To evaluate the effectiveness of the triple dependency feature, it was compared with several other feature representations on Chinese reviews. The experiments show that the triple dependency feature is effective for sentiment analysis and achieves better classification accuracy than the common features.

(2) A random subspace method based on binary particle swarm optimization was proposed. High feature dimensionality is detrimental to a classification system: it reduces classification accuracy and generalization. The random subspace method partitions the high-dimensional feature space into many small feature subspaces by bootstrap, reducing the original dimension. Binary particle swarm optimization is then used to select base classifiers according to their results. The experiments used the support vector machine as the base classifier. They show that the random subspace method based on binary particle swarm optimization reduces feature dimension effectively while maintaining classification accuracy and the diversity of the ensemble.

(3) A machine learning method combining meta-learning and deep learning was proposed. Following meta-learning theory, the base classifiers trained by the random subspace method based on binary particle swarm optimization were used as the training samples for meta-learning, with a deep belief network as the classification algorithm. By combining the strengths of meta-learning with the nonlinear mapping capability of deep learning, the dissertation presents a theoretical framework and algorithm flow based on deep belief networks and meta-learning. The method was evaluated on reviews, comparing the effect of these two methods on text sentiment classification. The results show that the deep belief network based on meta-learning not only reduced running time but also further improved accuracy.
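The selection idea in contribution (2) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. The assumptions are: a nearest-centroid learner stands in for the support vector machine, feature subsets are drawn without replacement, the fitness is plain validation accuracy, and all function names and the toy data are hypothetical. The sketch only shows how a binary mask encodes which base classifiers vote and how such a mask could be scored. The particle swarm search over masks is left out.

```python
import random
from collections import Counter


def sample_subspaces(n_features, n_subspaces, subspace_size, rng):
    """Draw one random subset of feature indices per base classifier."""
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_subspaces)]


def train_centroids(X, y, features):
    """Stand-in base learner (not an SVM): per-class mean on the chosen features."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(model, features, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(label):
        return sum((row[f] - c) ** 2 for f, c in zip(features, model[label]))
    return min(sorted(model), key=dist)


def ensemble_predict(models, subspaces, mask, row):
    """Majority vote over the base classifiers switched on by the binary mask."""
    votes = Counter(
        predict_centroid(m, feats, row)
        for m, feats, bit in zip(models, subspaces, mask) if bit
    )
    return votes.most_common(1)[0][0]


def mask_fitness(models, subspaces, mask, X_val, y_val):
    """Score a mask by ensemble accuracy; an empty selection scores 0.0."""
    if not any(mask):
        return 0.0
    hits = sum(ensemble_predict(models, subspaces, mask, r) == t
               for r, t in zip(X_val, y_val))
    return hits / len(y_val)


# Toy data (made up for illustration): four numeric features, two classes.
X = [[0, 0, 0, 0], [0, 1, 0, 1], [5, 5, 5, 5], [5, 4, 5, 4]]
y = ["neg", "neg", "pos", "pos"]

rng = random.Random(0)
subspaces = sample_subspaces(n_features=4, n_subspaces=5, subspace_size=2, rng=rng)
models = [train_centroids(X, y, feats) for feats in subspaces]

# A mask of all ones keeps every base classifier in the vote.
all_on = [1] * len(models)
print(mask_fitness(models, subspaces, all_on, X, y))
```

In a binary particle swarm search, each particle's position would be one such mask, and `mask_fitness` would play the role of the objective that the swarm tries to maximize.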
Keywords/Search Tags: machine learning, sentiment classification, dependency syntactic parser, random subspace, meta-learning