
The Sentimental Classification System With Stacked Denoising Autoencoder

Posted on: 2018-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Wang
GTID: 2348330563452240
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of social media, more and more people post comments online through micro-blogs and similar platforms, and these comments carry different emotions. Sentiment analysis emerged to identify the emotional tendency of such content automatically, and the sentiment classification derived from it has had a considerable impact on the traditional comment-system industry. Traditional sentiment classification methods rely mainly on manual labeling and emotion dictionaries. Given the volume of micro-blog posts, labeling them manually would consume a great deal of time and manpower. We therefore first use deep learning to extract features from micro-blog data and then classify the posts. The main work is as follows.

First, a sentiment classification system based on deep learning is designed and implemented. Data processing consists of Chinese word segmentation and stopword removal with IKAnalyzer2012, simple feature extraction with TF-IDF and the chi-square (CHI) test, and word-vector generation and expansion with word2vec. To obtain richer feature vectors, we propose the following approach: after data processing, the system uses a stacked denoising auto-encoder to extract features from the micro-blog data and then classifies them with a Softmax classifier. The stacked denoising auto-encoder is a deep learning model with strong feature-representation ability. To verify the difference between deep and shallow learning models, we design a comparative experiment in which the word vectors are fed to a shallow classifier with and without the deep learning model. Compared with the shallow classifier alone, the classifier combined with the deep learning model achieves higher accuracy.

Second, micro-blog data contains a large amount of colloquial language, which makes the data noisy and very sparse. To address this, we propose a stacked denoising auto-encoder with a sparse factor, in which a sparse factor is added to each hidden layer of the stacked denoising auto-encoder. This allows the extracted features to better preserve the characteristics of the input data, thereby improving the generalization ability of the model. Classification accuracy improves by 5% compared with the previous experiments, and the choice of objects to which the sparse factor is applied also affects the final classification results.

Third, to improve the efficiency of the stacked denoising auto-encoder with sparse factor, a new method is designed and implemented that runs the model on a distributed-memory parallel computer cluster. As a result, the training time of the system is greatly reduced.
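
The abstract does not say how the sparse factor enters the training objective. As a rough illustration only, the sketch below computes the loss of a single denoising auto-encoder layer under several assumptions that are not taken from the thesis: masking noise as the corruption, tied encoder/decoder weights, squared reconstruction error, and a KL-divergence sparsity penalty on the mean hidden activation. All function and parameter names (`corrupt`, `layer_loss`, `rho`, `beta`, and so on) are hypothetical.

```python
import math
import random


def corrupt(x, noise_level, rng):
    """Masking noise (assumed): zero each component with probability noise_level."""
    return [0.0 if rng.random() < noise_level else v for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(x, W, b):
    """Hidden activations h = sigmoid(W x + b); W has one row per hidden unit."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def decode(h, W, c):
    """Tied-weight reconstruction x_hat = sigmoid(W^T h + c) (an assumption)."""
    n_in = len(W[0])
    return [sigmoid(sum(W[j][i] * h[j] for j in range(len(h))) + c[i])
            for i in range(n_in)]


def kl_sparsity(rho, rho_hat):
    """Sum over hidden units of KL(rho || rho_hat_j) for Bernoulli distributions."""
    return sum(rho * math.log(rho / r)
               + (1 - rho) * math.log((1 - rho) / (1 - r))
               for r in rho_hat)


def layer_loss(batch, W, b, c, noise_level=0.3, rho=0.05, beta=0.1, seed=0):
    """Mean squared reconstruction error plus beta times the sparsity penalty."""
    rng = random.Random(seed)
    recon = 0.0
    mean_act = [0.0] * len(W)
    for x in batch:
        h = encode(corrupt(x, noise_level, rng), W, b)
        x_hat = decode(h, W, c)
        # The reconstruction is compared with the clean input, not the corrupted one.
        recon += sum((a - t) ** 2 for a, t in zip(x_hat, x))
        mean_act = [m + hj / len(batch) for m, hj in zip(mean_act, h)]
    return recon / len(batch) + beta * kl_sparsity(rho, mean_act)
```

In a stacked model, one such layer would typically be trained at a time, with its hidden activations serving as the input to the next layer, before a Softmax classifier is placed on top. Here `rho` is the target mean activation and `beta` weights the penalty; both values are placeholders, not settings reported in the thesis.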
Keywords/Search Tags:Deep Learning, Sentimental Classification, Stacked Denoising Autoencoder, Sparse Factor, Distributed Memory Computing