Research On Algorithm Of Sentiment Classification Based On SVM And Deep Learning

Posted on:2017-05-05
Degree:Master
Type:Thesis
Country:China
Candidate:Z Y Huang
GTID:2348330533950176
Subject:Computer technology
Abstract/Summary:
With the rapid development of the Internet, traditional lifestyles and commercial structures have changed drastically. From e-commerce and social software to taxi apps, the Internet is everywhere. People contact each other, present themselves and post comments through WeChat, microblogs and other social networking tools. As a result, these applications produce massive amounts of data containing a large number of views and opinions of inestimable value, which has made large-scale text data processing a very popular research area. Sentiment classification of text data is one of the main related research fields, and it is the main research content of this thesis. In current Chinese sentiment analysis, machine-learning-based recognition methods usually rely on statistics-based features, which cannot analyze complex sentences effectively and cannot deeply reflect text semantics. To address the limited ability to analyze complex sentences, this thesis constructs a feature extraction rule for various complex sentences and proposes a new text sentiment recognition method based on SVM and complex sentences. In the experiments, emotional word, POS and negative word features are combined, conditional sentence and turning (adversative) sentence features are then added, and different classifiers and kernels are tested; the proposed method achieves a best classification rate of 90.12%. At the same time, the experimental results indicate that the proposed method is sensitive to the specific manually designed features, so it is not robust enough to cover all the information. To solve the above problems, this thesis adopts the deep-learning-based Word2vec tool, which can train low-dimensional word vectors that contain deep semantic information.
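The hand-designed features of the first method (emotional words, negative words, conditional sentences and turning sentences) can be illustrated with a minimal sketch. The abstract does not give the actual lexicons or rules, so the marker sets, the function name `extract_features` and the layout of the feature vector below are all assumptions made for illustration. POS features are omitted.

```python
# Hypothetical marker lexicons, chosen only for illustration; the thesis
# abstract does not list the lexicons or rules it actually uses.
NEGATION_WORDS = {"不", "没", "没有", "not", "never"}
CONDITIONAL_MARKERS = {"如果", "要是", "if", "unless"}
TURNING_MARKERS = {"但是", "可是", "然而", "but", "however"}
POSITIVE_WORDS = {"好", "喜欢", "good", "great"}
NEGATIVE_WORDS = {"差", "讨厌", "bad", "poor"}


def extract_features(tokens):
    """Map a tokenized sentence to a small numeric feature vector.

    Returns [polarity before a turning word, polarity after it,
             number of negation words, has conditional marker (0/1),
             has turning marker (0/1)].
    """
    score_before = 0
    score_after = 0
    n_negations = 0
    has_conditional = False
    seen_turning = False
    negate = False  # a negation word flips the next sentiment word

    for tok in tokens:
        if tok in CONDITIONAL_MARKERS:
            has_conditional = True
        elif tok in TURNING_MARKERS:
            seen_turning = True
            negate = False  # negation does not carry across the turn
        elif tok in NEGATION_WORDS:
            negate = not negate
            n_negations += 1
        elif tok in POSITIVE_WORDS or tok in NEGATIVE_WORDS:
            polarity = 1 if tok in POSITIVE_WORDS else -1
            if negate:
                polarity = -polarity
                negate = False
            if seen_turning:
                score_after += polarity
            else:
                score_before += polarity

    return [score_before, score_after, n_negations,
            int(has_conditional), int(seen_turning)]


# "The screen is not good, but the battery is very good."
example = extract_features(["屏幕", "不", "好", "但是", "电池", "很", "好"])
```

Keeping the polarity before and after the turning word as separate features lets a classifier such as an SVM weight the clause after "but" differently from the clause before it, which a plain bag-of-words count cannot do.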
In this second method, word vectors trained by Word2vec are used as features and fused with frequency-weighted features computed by TF-IDF. An SVM is then trained on the fused features and obtains good performance. By further adjusting the penalty parameter C, the best accuracy reaches 94% at C=10, an improvement of 3.23% over the method based on SVM and complex phrasing. A method that fuses Hash mapping features with word vectors is also proposed, and it likewise achieves good performance. Overall, the research in this thesis shows that adding complex sentence features to traditional statistical features improves accuracy by 7.16% compared with using statistical feature combinations alone. Furthermore, adopting deep learning through word vectors improves accuracy markedly, by 4.25%, after fusion with the statistical features, and also improves the evaluation indices for positive comments. Based on the above research, a text sentiment analysis system is designed and developed in this thesis, which includes data preprocessing, segmentation, sentiment classification, result display and other modules.
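The fusion of word vectors with TF-IDF features can be sketched as follows. The abstract does not say how the two feature sets are combined, so this sketch assumes simple concatenation of an averaged word vector with a TF-IDF vector. The toy `embeddings` dictionary stands in for vectors that Word2vec would produce, the smoothed IDF formula is one common variant, and all function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (one common variant): log((1 + N) / (1 + df)) + 1
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1
           for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def mean_word_vector(doc, embeddings, dim):
    """Average the word vectors of the tokens that have an embedding."""
    known = [embeddings[tok] for tok in doc if tok in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def fuse(docs, embeddings, dim):
    """Concatenate each document's mean word vector with its TF-IDF vector."""
    _, tfidf = tfidf_vectors(docs)
    return [mean_word_vector(doc, embeddings, dim) + tfidf[i]
            for i, doc in enumerate(docs)]


# Toy data: 2-dimensional stand-ins for Word2vec output (assumed values).
docs = [["good", "phone"], ["bad", "phone"]]
embeddings = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "phone": [0.0, 1.0]}
fused = fuse(docs, embeddings, dim=2)
```

Each fused vector has `dim + len(vocab)` entries and would then be passed to an SVM with penalty parameter C, for example scikit-learn's `SVC(C=10)` to mirror the C=10 setting reported above. Whether the thesis used that library is not stated.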
Keywords/Search Tags:sentiment analysis, SVM, Word2vec, deep learning, complex phrasing