Text Augmentation Method Based On Label Relevance Weight Filtering Mechanism In Sentiment Classification

Posted on:2021-04-20

Degree:Master

Type:Thesis

Country:China

Candidate:L M Shi

Full Text:PDF

GTID:2518306113461884

Subject:Economic big data analysis

Abstract/Summary:

In supervised learning, training deep models requires a large amount of labeled data; without it, the models overfit easily. Data is labeled by people, which takes a lot of time. To reduce the annotation workload, rule-based transformations are often applied to text to generate similar text, which is known as data augmentation. However, data augmentation brings problems of its own. Traditional text augmentation randomly selects words and transforms them randomly, so the augmented data may be inconsistent with the original data in sentiment polarity and semantics, and may therefore introduce noise into the training of supervised learning models.

To reduce this randomness, the thesis uses the relevance between labels and text words to select the words least relevant to the label for augmentation. In sentiment classification tasks, sentiment labels are highly correlated with words that carry an obvious sentiment polarity. If such words are selected for transformation during random augmentation, they may be replaced by new words whose polarity is opposite to the label, leaving the label inconsistent with the newly generated text.

The main work is as follows: (1) Two calculation methods for the relevance weight: the coefficients of Logistic Regression and the attention scores of Label Embedding. (2) In four replacement augmentation methods, the N words least relevant to the label (N is a hyperparameter) are selected for replacement according to the relevance weight between the sentiment label and the text words; this is called weighted replacement augmentation.

The thesis tests the weighted replacement augmentation methods on the NLPCC2014 dataset. The experimental results show that replacement augmentation methods based on the relevance between labels and text words can effectively improve accuracy and F1-score in sentiment classification tasks.
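
The selection step in item (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that a per-word relevance weight is already available (for example, the absolute value of a Logistic Regression coefficient, where taking the absolute value is an assumption made here), that unseen words get weight 0.0, and that replacements come from a small hand-written synonym table. The function names, weights, and synonyms are all hypothetical.

```python
from typing import Dict, List


def least_relevant_positions(tokens: List[str],
                             relevance: Dict[str, float],
                             n: int) -> List[int]:
    """Return the positions of the n tokens with the lowest label relevance.

    Tokens missing from `relevance` are treated as weight 0.0 (an assumption).
    Ties are broken by position, so the result is deterministic.
    """
    ranked = sorted(range(len(tokens)),
                    key=lambda i: (relevance.get(tokens[i], 0.0), i))
    return sorted(ranked[:n])


def weighted_replace(tokens: List[str],
                     relevance: Dict[str, float],
                     synonyms: Dict[str, List[str]],
                     n: int) -> List[str]:
    """Replace up to n low-relevance tokens using a synonym table.

    A selected token with no entry in `synonyms` is left unchanged.
    """
    out = list(tokens)
    for i in least_relevant_positions(tokens, relevance, n):
        candidates = synonyms.get(tokens[i])
        if candidates:
            out[i] = candidates[0]  # first candidate, for determinism
    return out


# Hypothetical weights: "wonderful" is strongly tied to the positive label.
tokens = ["the", "film", "was", "truly", "wonderful"]
relevance = {"the": 0.01, "film": 0.20, "was": 0.02,
             "truly": 0.35, "wonderful": 1.80}
synonyms = {"film": ["movie"], "truly": ["really"], "wonderful": ["terrible"]}

print(weighted_replace(tokens, relevance, synonyms, n=3))
# ['the', 'movie', 'was', 'truly', 'wonderful']
```

In this toy example the polarity-bearing word "wonderful" is never selected, so the label-flipping replacement "terrible" cannot occur, whereas purely random selection could pick it.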

Keywords/Search Tags:

Sentiment Classification, Text Augmentation, Logistic Regression, Label Embedding

Related items

1	The Application Of Label Embedding In Text Classification
2	Text Classification Based On Label Embedding And Attention Mechanism
3	Research On The Essential Technology Of Multi-Label Chinese Text Classification
4	Research On Capsule Network Text Classification Algorithm Based On Label Embedding
5	Research On Short Text Sentiment Classification Technology
6	Research On Aspect Extraction Method In Text Sentiment Analysis
7	Research And Application Of Text Classification Algorithm Based On Label Embedding And Self-Interaction Attention
8	Research On Feature Generation Methods For Text Sentiment Classification
9	Research On Twitter Sentiment Classification Based On Sentiment Word Embedding And Convolutional Neural Networks
10	Maximal Uncorrelated Multinomial Logistic Regression And Its Application In Large-scale Text Classification