
Research On The Method Of Sensitive Word Vector And Sentiment Classification Based On Deep Learning

Posted on: 2018-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: S J Zhang
Full Text: PDF
GTID: 2348330533466272
Subject: Computer technology
Abstract/Summary:
Sentiment analysis is the recognition of the sentiment polarity (positive, negative or neutral) or sentiment intensity (strong or weak) of a given text or text fragment (e.g., sentences, phrases or words). It can be applied to product review analysis, where it identifies users' sentiment toward a product's design and provides decision support for businesses and product designers. Most previous studies used hand-crafted features and traditional machine learning algorithms to build recognition systems. However, manual feature extraction requires expert domain knowledge, so such systems are poorly suited to practical use and incur high labor costs. In recent years, researchers have begun to use deep learning methods to extract features automatically. One of the most fundamental results of deep learning in natural language processing is the word vector, that is, the distributed representation of words, which has been applied in many natural language processing tasks. However, traditional word vectors are learned from the linguistic context of words and contain only semantic and syntactic information, whereas the sentiment information of a word is essential for sentiment analysis tasks.
Most existing word vector learning methods model only the syntactic context of words and ignore their sentiment information, so they are not well suited to sentiment classification. To solve this problem, this thesis first proposes a word vector training model based on deep learning, which uses two simple strategies to combine the sentiment information in the text with the context words of the current word. To verify whether the learned sentiment word vectors accurately capture both sentiment information and the semantics of the context words, this thesis builds sentiment word vectors for different languages and different domains and performs quantitative experiments at the word level. To extend semantic word vectors from words to long texts, this thesis proposes an active deep belief network algorithm based on semi-supervised learning theory. It combines an adaptive deep belief network with active learning to solve the sample selection problem in semi-supervised sentiment classification, and it uses the same deep architecture for both semi-supervised learning and active learning, so that the architecture is trained iteratively during active learning and gradually improves its abstraction and classification ability. For massive text data, HDFS is used for distributed storage of web text data in order to improve the efficiency of sentiment classification. Text preprocessing and the parallel optimization of the deep belief network are implemented using Spark. The experiments show that the distributed deep belief network can greatly reduce training time and speed up computation.
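The sample selection step in active learning can be illustrated with a minimal sketch. The abstract does not state which selection criterion the thesis uses, so the entropy-based uncertainty sampling below is an assumption chosen because it is a common active-learning strategy; the function names and the example probabilities are hypothetical, and the class probabilities stand in for the output of whatever deep network is being trained.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(unlabeled_probs, k):
    """Return indices of the k samples with the highest predictive entropy.

    unlabeled_probs: one list of class probabilities per unlabeled sample,
    as produced by any classifier (a stand-in for the deep network here).
    """
    ranked = sorted(
        range(len(unlabeled_probs)),
        key=lambda i: entropy(unlabeled_probs[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical classifier outputs for four unlabeled reviews: [P(neg), P(pos)].
probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90], [0.50, 0.50]]

# The two samples the classifier is least sure about are sent for labeling.
picked = select_most_uncertain(probs, 2)
print(picked)
```

In an iterative loop, the selected samples would be labeled, added to the training set, and the network retrained before the next selection round.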
Keywords/Search Tags: sentiment analysis, word vector, sentiment word vector, deep learning, Spark parallel computing