
Sentiment Classification In Social Network Based On Feature Autonomic Learning

Posted on: 2018-07-08 | Degree: Master | Type: Thesis
Country: China | Candidate: F Y Lin | Full Text: PDF
GTID: 2428330542476904 | Subject: Computer technology
Abstract/Summary:
With the rapid growth of Web text data, mining and analyzing this text, especially the online reviews posted by users, can greatly help in understanding users' consumption habits and public opinion, and it plays an important role in decision-making for enterprises and the government. Chinese text sentiment classification still faces several problems. On the one hand, when the vector space model is used to represent massive amounts of short text, the data suffers from the curse of dimensionality. The vector space model also treats words as atomic units, so the resulting features cannot express contextual semantic relationships between words well. On the other hand, because the number of texts is large, training is very time-consuming and it is difficult to achieve the desired results.

To solve these problems, this thesis proposes a general framework for feature generation and sentiment classification. Feature extraction consists of low-level feature extraction and mid-level feature extraction. Low-level feature extraction uses the chi-square statistic to measure the relevance of each word and builds sentiment lexicons from the most relevant words instead of using all words. This module extracts the most representative features from the original text and helps reduce the dimensionality of the text representation. However, term-frequency statistics over the low-level features are still not sufficient to express the semantic relations between feature words, so this thesis adds a mid-level feature extraction stage in which features are extracted autonomously. Mid-level feature extraction is unsupervised and learns a distributed representation of the feature words through a neural network. Combined with the proposed methods, especially the Pooling representation, this solves the problem that there is no notion of similarity between words. In the sentiment classification module, a single-hidden-layer neural network is used as the classifier, and it is trained with the Extreme Learning Machine (ELM) algorithm. Compared with traditional classifiers, the ELM algorithm generalizes well and is less prone to getting trapped in local extrema.

The contributions of this thesis are as follows. First, the distributed word representations learned in the mid-level feature stage, combined with the proposed methods, especially the Pooling representation, both reduce the dimensionality of the text data and solve the problem that there is no notion of similarity between words. Second, this thesis uses the ELM algorithm to construct the classifier, which improves the efficiency of sentiment classification on massive data while maintaining precision.

The proposed method is evaluated on Chinese hotel reviews from social networks and on microblog data provided by SINA. Experimental results show that precision improves slightly after mid-level feature learning compared with using low-level features alone, and that the feature dimension is significantly reduced. The network structure is simplified while recognition performance remains high.
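
The low-level step described above, scoring words with the chi-square statistic and keeping only the most relevant ones as a lexicon, can be sketched as follows. This is a minimal illustration rather than the thesis's implementation: it assumes a two-class (positive/negative) setting, already-tokenized documents, and document-frequency counts in the standard 2x2 word/class contingency table. The function names `chi_square_scores` and `build_lexicon`, the cutoff `k`, and the toy English reviews are all hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive_label):
    """Return {word: chi-square score} from a 2x2 word/class contingency table.

    Assumption: binary classes and document-frequency counts; the thesis
    does not specify these details.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive_label)
    n_neg = n - n_pos

    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = df_pos if y == positive_label else df_neg
        target.update(set(tokens))  # count each word once per document

    scores = {}
    for word in set(df_pos) | set(df_neg):
        a = df_pos[word]   # positive docs containing the word
        b = df_neg[word]   # negative docs containing the word
        c = n_pos - a      # positive docs without the word
        d = n_neg - b      # negative docs without the word
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # chi2 = N * (AD - BC)^2 / ((A+B)(C+D)(A+C)(B+D)); 0 if degenerate
        scores[word] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def build_lexicon(docs, labels, positive_label, k):
    """Keep the k highest-scoring words (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels, positive_label)
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


# Hypothetical toy corpus (English tokens for readability).
docs = [
    ["room", "clean", "great"],
    ["great", "service"],
    ["room", "dirty", "bad"],
    ["bad", "service"],
]
labels = ["pos", "pos", "neg", "neg"]

print(build_lexicon(docs, labels, "pos", k=2))
```

Words that appear equally often in both classes (here `room` and `service`) score zero and drop out of the lexicon, which is how this kind of selection shrinks the feature dimension before any mid-level representation is learned.
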
Keywords/Search Tags: sentiment classification, feature extraction, extreme learning machine (ELM)