
Deep Representation Learning For Sentence Classification

Posted on: 2020-01-29 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Q R Zhou | Full Text: PDF
GTID: 1368330605481319 | Subject: Intelligent Science and Technology
Abstract/Summary:
Building high-quality sentence representations is the foundation of better performance on sentence classification. Traditional methods represent sentences as high-dimensional vectors, which suffer from data sparsity. Recently, the low-dimensional, continuous-valued sentence representations learned by deep neural networks have been shown to effectively alleviate the data-sparsity problem of traditional methods. However, several issues remain in these deep representation learning models: 1) they learn only one representation per sentence and classify the sentence based on that single representation; 2) most of them learn representations using only class labels, ignoring the structured relations among sentences; 3) most of them optimize representations for the single task of sentence classification, and few studies have tried to exploit related tasks to improve sentence representation learning. This thesis focuses on these three problems. Based on an extensive analysis of related work, three deep representation learning methods, each emphasizing a different aspect of sentence classification, are proposed. The main contributions of this thesis are as follows.

A differentiated self-attention representation learning model is proposed. It learns two different representations of a single sentence by shifting its attention. The model consists of a shared memory, two self-attention sub-classifiers with the same architecture but different parameters, and an example discriminator. Driven by the proposed differentiated loss, the two sub-classifiers are encouraged to extract key features from different parts of a sentence, yielding two distinct representations of the same sentence. Each sub-classifier then makes its own prediction from its representation, and the example discriminator selects the better one for classification. Experimental results on four public datasets and one proposed synthetic dataset demonstrate that the model outperforms other self-attention and composition-based classification models and captures different aspects of a sentence in its representations. A minimal sketch of this architecture is given below.

A multi-sample deep representation learning approach based on distance regularizations is proposed. It exploits the structured relations among multiple sentences when learning representations. Two loss functions, based on absolute and relative distance metrics respectively, are proposed. Both regularize the structured relations between sentences: 1) sentences with different class labels are kept far apart, enforcing a large inter-class margin in representation space; 2) sentences with the same class label form cluster structures, revealing reasonable intra-class variations in representation space. Either loss can easily be combined with existing softmax-based classification models, optimizing sentence representations jointly with the classification loss. Experimental results on four public datasets demonstrate that the approach consistently boosts the performance of several popular softmax-based deep classification models and captures cluster structures in the sentence representation space. Sketches of both losses follow the architecture sketch below.
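The following is a minimal, illustrative PyTorch sketch of the differentiated self-attention idea described above. It is not the thesis's implementation: the module layout, the single-linear attention scoring, and the product-overlap form of the differentiated loss are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiatedSelfAttention(nn.Module):
    """Two self-attention sub-classifiers over a shared memory; an example
    discriminator decides which branch's prediction to trust."""
    def __init__(self, vocab_size, emb_dim, n_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)  # shared memory
        self.att = nn.ModuleList([nn.Linear(emb_dim, 1) for _ in range(2)])
        self.clf = nn.ModuleList([nn.Linear(emb_dim, n_classes) for _ in range(2)])
        self.disc = nn.Linear(2 * emb_dim, 2)           # example discriminator

    def forward(self, tokens):                          # tokens: (batch, seq_len)
        h = self.embed(tokens)                          # (batch, seq_len, emb_dim)
        reps, logits, attns = [], [], []
        for att, clf in zip(self.att, self.clf):
            a = F.softmax(att(h).squeeze(-1), dim=-1)   # attention over positions
            r = torch.bmm(a.unsqueeze(1), h).squeeze(1) # one sentence representation
            attns.append(a)
            reps.append(r)
            logits.append(clf(r))
        gate = self.disc(torch.cat(reps, dim=-1))       # scores the two branches
        return logits, attns, gate

def differentiated_loss(attns):
    # Assumed form: penalize overlap between the two attention distributions
    # so the branches attend to different parts of the sentence.
    return (attns[0] * attns[1]).sum(dim=-1).mean()
```

In this reading, the differentiated loss is added to the two branches' classification losses during training, so that each branch stays accurate while being pushed toward a different part of the sentence.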
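The two distance-based regularizers can likewise be pictured as contrastive-style (absolute) and triplet-style (relative) penalties added to the usual cross-entropy loss. The exact formulations in the thesis are not reproduced here; the margin values, squared distances, and triplet sampling below are assumptions.

```python
import torch
import torch.nn.functional as F

def absolute_distance_loss(reps, labels, margin=1.0):
    """Contrastive-style regularizer (assumed form): pull intra-class pairs
    together, push inter-class pairs beyond a margin."""
    d = torch.cdist(reps, reps)                               # pairwise distances
    same = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    eye = torch.eye(len(labels), device=reps.device)
    pull = ((same - eye) * d.pow(2)).sum()                    # intra-class, no self-pairs
    push = ((1.0 - same) * F.relu(margin - d).pow(2)).sum()   # inter-class margin
    n_pairs = len(labels) * (len(labels) - 1)
    return (pull + push) / max(n_pairs, 1)

def relative_distance_loss(anchor, pos, neg, margin=1.0):
    """Triplet-style regularizer (assumed form): an anchor should be closer
    to a same-class sample than to a different-class one."""
    return F.relu((anchor - pos).norm(dim=-1)
                  - (anchor - neg).norm(dim=-1) + margin).mean()

# Combined with the usual softmax classification loss, e.g.:
#   loss = F.cross_entropy(logits, labels) + lam * absolute_distance_loss(reps, labels)
```

Because each regularizer only consumes the representations and labels already available in a softmax-based classifier, it can be bolted onto an existing model without architectural changes, which matches the plug-in usage the abstract describes.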
A group of hierarchical Long Short-Term Memory (LSTM) networks for joint representation learning is proposed. They jointly model sentence classification and related auxiliary tasks. Each model is composed of a two-layer LSTM network, with the upper LSTM handling sentence classification and the lower LSTM handling an auxiliary sequence labeling task. During training, the representation of each sentence is simultaneously supervised by both tasks. Moreover, the networks introduce a hyperparameter that controls the interactive, hierarchical information flowing across the LSTM layers, balancing the supervision from the two tasks. Experimental results on two public datasets for intent classification and slot filling demonstrate that the models fully exploit the information from the auxiliary task when learning representations, thus boosting sentence classification performance. A sketch of this joint model is given at the end of this abstract.

A prototype system for classifying the sentiment polarity of sentences, built on the proposed deep representation learning models, is designed and implemented. Leveraging the differentiated self-attention model and the multi-sample deep representation learning approach, the system has two major functions: 1) given an input sentence, it predicts the sentence's sentiment polarity using the proposed methods; 2) it provides rich visualizations of the internal processing results of the proposed methods.
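Finally, a hedged sketch of one hierarchical LSTM from the third contribution: the lower LSTM is supervised by slot filling, the upper by intent classification, and a hyperparameter (here called lam) balances how much lower-layer information flows upward. The interpolation-style gating and the loss weighting are assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn as nn

class HierarchicalLSTM(nn.Module):
    """Lower LSTM: auxiliary sequence labeling (e.g. slot filling).
    Upper LSTM: sentence classification (e.g. intent detection).
    `lam` gates the information flowing from the lower to the upper layer."""
    def __init__(self, vocab_size, dim, n_slots, n_intents, lam=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lower = nn.LSTM(dim, dim, batch_first=True)
        self.upper = nn.LSTM(dim, dim, batch_first=True)
        self.slot_head = nn.Linear(dim, n_slots)
        self.intent_head = nn.Linear(dim, n_intents)
        self.lam = lam

    def forward(self, tokens):                         # tokens: (batch, seq_len)
        e = self.embed(tokens)
        h_low, _ = self.lower(e)                       # per-token states
        slot_logits = self.slot_head(h_low)            # auxiliary supervision
        # Assumed gating: interpolate lower-layer states with raw embeddings.
        h_up, _ = self.upper(self.lam * h_low + (1 - self.lam) * e)
        intent_logits = self.intent_head(h_up[:, -1])  # last state classifies
        return intent_logits, slot_logits

# Training combines both supervisions (assumed equal weighting):
#   loss = F.cross_entropy(intent_logits, intents) \
#        + F.cross_entropy(slot_logits.flatten(0, 1), slots.flatten())
```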
Keywords/Search Tags: deep representation learning, sentence classification, self-attention, distance regularization, long short-term memory network