
Research On Feature Representation Based On Sentiment Classification

Posted on: 2020-01-05    Degree: Master    Type: Thesis
Country: China    Candidate: K R Yu    Full Text: PDF
GTID: 2428330596968165    Subject: Software engineering
Abstract/Summary:
In recent years, research on neural networks (deep learning) has developed rapidly, and natural language processing, as an applied field, is one of its main targets. Although neural networks have achieved great success in natural language processing, the demands of real applications keep growing, and many problems remain even in basic tasks such as text classification and sentiment classification. Sentiment classification is a basic natural language processing task, and a classifier's performance is sensitive to its features: the effectiveness of the features the classifier extracts and uses directly affects how well it performs. Distributed representation of features is one of the core research topics of neural networks. Because natural language features such as words are discrete, distributed representation modeling is especially important in natural language processing tasks. This thesis is motivated by a company's need for public opinion analysis and aims to complete a natural language sentiment classification task. To this end, we conduct a series of studies on feature representation for sentiment classification. The main contributions of this thesis are as follows:

1. For the sentiment classification task, we propose an approach for adapting word representations (word vectors). Most existing word vector learning schemes are decoupled from specific natural language processing tasks and are not further adjusted for them. In this thesis, we introduce the concept of a word vector sentiment component and use it to interpret the sentiment information carried in word vectors. We then use an additional sentiment lexicon to add sentiment information to pre-trained word vectors (an illustrative sketch follows this abstract). The RT and IMDB sentiment classification tasks are used to evaluate the adaptation method; compared with transferring the original word vectors, transferring the adapted word vectors improves performance across multiple models.

2. For short text, we propose a multi-level short text sentiment classifier. Word sequence features and bag-of-words features are the two main feature types used in text classification, but collecting multiple feature types in a single model can make it too complex and hard to converge. We design two basic classifiers, an LSTM-Attn classifier for sequence features and a DAN classifier for bag-of-words features, and then train and integrate multiple basic classifiers with a masked ensemble learning method (see the sketch after this abstract). The integrated model can exploit multiple feature types while avoiding the training difficulties of a single large model. The multi-level short text classifier integrates 5 LSTM-Attn classifiers and 5 DAN classifiers and achieves an accuracy of 86.21%.

3. For news text, we propose a multi-level news text sentiment classifier. A news article consists of a headline and a body. Headlines are concise and easy to judge with the short text sentiment classifier, but misjudgments can still occur, so we build a long text classifier composed of short text classifiers to classify the sentiment of the news body. The multi-level news text sentiment classifier judges the sentiment polarity of the whole article by combining the classification results of the headline and the body (a sketch of this combination also follows), and achieves an accuracy of 93.66% on a news text test set provided by the company.
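The abstract does not spell out how the sentiment lexicon is used to adapt pre-trained word vectors. The following is a minimal illustrative sketch, not the thesis's actual method: it assumes a word-to-vector dictionary and a polarity lexicon, estimates a single "sentiment direction" from the lexicon, and nudges lexicon words along it. The function name `adapt_word_vectors` and the step size `alpha` are hypothetical.

```python
import numpy as np

def adapt_word_vectors(vectors, lexicon, alpha=0.1):
    """Nudge pre-trained word vectors along a lexicon-derived sentiment direction.

    vectors: dict mapping word -> np.ndarray (pre-trained embedding)
    lexicon: dict mapping word -> +1.0 (positive) or -1.0 (negative)
    alpha:   step size controlling how strongly sentiment is injected
    """
    # Estimate a "sentiment component": the difference between the mean
    # embedding of positive lexicon words and that of negative lexicon words.
    pos = [vectors[w] for w, s in lexicon.items() if s > 0 and w in vectors]
    neg = [vectors[w] for w, s in lexicon.items() if s < 0 and w in vectors]
    direction = np.mean(pos, axis=0) - np.mean(neg, axis=0)
    direction /= np.linalg.norm(direction)

    # Shift each lexicon word's vector along the sentiment direction,
    # signed by its polarity; out-of-lexicon words are left untouched.
    adapted = dict(vectors)
    for w, polarity in lexicon.items():
        if w in adapted:
            adapted[w] = adapted[w] + alpha * polarity * direction
    return adapted
```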
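The abstract names an LSTM-Attn base classifier for sequence features, a DAN base classifier for bag-of-words features, and a masked ensemble learning method. The PyTorch sketch below shows minimal versions of the two base classifiers and, for simplicity, replaces the masked ensembling with plain probability averaging; the class names, hyperparameters, and averaging scheme are assumptions, not the thesis's exact design.

```python
import torch
import torch.nn as nn

class DANClassifier(nn.Module):
    """Deep Averaging Network: average word embeddings, then feed-forward layers."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.ff = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, token_ids):              # token_ids: (batch, seq_len)
        mask = (token_ids != 0).unsqueeze(-1).float()
        emb = self.embed(token_ids) * mask      # zero out padding positions
        avg = emb.sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.ff(avg)                     # (batch, num_classes) logits

class LSTMAttnClassifier(nn.Module):
    """BiLSTM with attention pooling over the hidden states."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))      # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # attention over time steps
        pooled = (weights * h).sum(dim=1)
        return self.out(pooled)

def ensemble_predict(models, token_ids):
    """Average the softmax outputs of all trained base classifiers."""
    with torch.no_grad():
        probs = torch.stack([torch.softmax(m(token_ids), dim=-1) for m in models])
    return probs.mean(dim=0).argmax(dim=-1)
```

Training several small base classifiers and merging their outputs reflects the design choice described in the abstract: no single model is forced to learn both sequence and bag-of-words features at once, which keeps each base model easy to train.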
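For the news-level decision, the abstract states only that the headline result and the body result are combined. The hypothetical helper below classifies the headline and each body sentence with the short text ensemble from the previous sketch and then averages the two levels with equal weights; the 0.5/0.5 weighting and the per-sentence splitting of the body are assumptions.

```python
import torch

def classify_news(headline_ids, body_sentence_ids, models):
    """Judge a whole news article from its headline and body.

    headline_ids:      (1, seq_len) token ids of the headline
    body_sentence_ids: (num_sentences, seq_len) token ids, one row per body sentence
    models:            trained short text base classifiers (e.g. from the sketch above)
    """
    with torch.no_grad():
        head_probs = torch.stack(
            [torch.softmax(m(headline_ids), dim=-1) for m in models]).mean(dim=0)
        body_probs = torch.stack(
            [torch.softmax(m(body_sentence_ids), dim=-1) for m in models]).mean(dim=0)
    # Aggregate sentence-level predictions into one body-level distribution,
    # then average headline and body evidence to decide the article's polarity.
    body_doc = body_probs.mean(dim=0, keepdim=True)
    combined = 0.5 * head_probs + 0.5 * body_doc
    return combined.argmax(dim=-1).item()
```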
Keywords/Search Tags: representation learning, word vectors, transfer learning, sentiment classification