Multi-Scale Recurrent Neural Network for Sentence Classification

Posted on: 2022-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Lin
Full Text: PDF
GTID: 2518306569481004
Subject: Computer technology
Abstract/Summary:
Sentence classification is a fundamental task in Natural Language Processing (NLP) and is widely used in many of its subareas, such as sentiment analysis, question classification, and spoken dialogue. The goal of the task is to assign a predefined label to an input sentence, and its central problem is to understand the semantic meaning of a sentence from keywords or phrases located at different positions. Although many deep neural network-based methods have achieved promising performance on sentence classification, the following problems remain: methods based on Convolutional Neural Networks (CNNs) have difficulty modeling non-consecutive dependencies and ignore sequential information, while methods based on Recurrent Neural Networks (RNNs) tend to ignore local features. For this reason, some methods combine the two by stacking multiple layers or attaching an attention mechanism, improving classification performance by increasing model complexity. However, such methods easily lead to feature redundancy or overfitting, especially since only relatively small training sets are available for sentence classification tasks. How to effectively integrate the characteristics of different neural networks, improve model generalization, and adequately capture the multi-scale features of sentences therefore remains a challenging problem.

This thesis explores a novel multi-scale recurrent neural network, called Multiscale Orthogonal Independent LSTM (MODE-LSTM), which absorbs the local feature modeling of CNNs and the non-consecutive dependency modeling of RNNs. MODE-LSTM has few effective parameters, considers multi-scale features of sentences, and achieves a good balance between capability and complexity. First, an Orthogonal Independent LSTM (ODE-LSTM) is designed that disentangles the hidden state into multiple independently updated small hidden states, reducing model complexity and improving generalization; an orthogonal constraint penalty loss is further applied to improve the diversity of the learned features. Second, a sliding window mechanism is introduced to limit the recurrent transmission step length of ODE-LSTM, so that the recurrent transition is performed only within a local window sliding along the sentence; this extracts local phrase features while retaining the non-consecutive dependency modeling ability of RNNs. Finally, windows of multiple scales are used in parallel with ODE-LSTM to extract n-gram features.

Experiments on eight sentence classification datasets achieve good results, and classification accuracy can be further boosted by combining pre-trained contextual representations. Extensive experimental analyses also verify the generalization of the model and the effectiveness of the multi-scale representation.
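The following is a minimal PyTorch sketch of the two ideas described above, written for illustration only: the class names, dimensions, gate layout, and pooling choices are assumptions of this sketch, not the thesis's released implementation. `ODELSTMCell` splits the hidden state into several small, independently updated hidden states and exposes an orthogonal penalty over their recurrent weights; `MODELSTM` runs the cell inside sliding windows of several sizes and max-pools the per-window final states into one sentence vector for classification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ODELSTMCell(nn.Module):
    """LSTM cell whose hidden state is disentangled into `n_heads` small
    hidden states, each updated independently (hypothetical reconstruction
    of the ODE-LSTM idea)."""
    def __init__(self, input_dim, head_dim, n_heads):
        super().__init__()
        self.n_heads, self.head_dim = n_heads, head_dim
        # Separate gate weights for each small hidden state.
        self.w_x = nn.Parameter(torch.randn(n_heads, input_dim, 4 * head_dim) * 0.1)
        self.w_h = nn.Parameter(torch.randn(n_heads, head_dim, 4 * head_dim) * 0.1)
        self.bias = nn.Parameter(torch.zeros(n_heads, 4 * head_dim))

    def forward(self, x, state):
        h, c = state  # each: (batch, n_heads, head_dim)
        gates = (torch.einsum('bi,nio->bno', x, self.w_x)
                 + torch.einsum('bnd,ndo->bno', h, self.w_h) + self.bias)
        i, f, g, o = gates.chunk(4, dim=-1)  # standard LSTM gates, per head
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

    def orthogonal_penalty(self):
        """Penalize non-orthogonality between the heads' recurrent weights
        so that the small hidden states learn diverse features."""
        w = F.normalize(self.w_h.reshape(self.n_heads, -1), dim=-1)
        gram = w @ w.t()  # (n_heads, n_heads) cosine similarities
        return (gram - torch.eye(self.n_heads, device=w.device)).pow(2).sum()

class MODELSTM(nn.Module):
    """Apply ODE-LSTM inside sliding windows of several sizes and max-pool
    the per-window final states into one sentence representation."""
    def __init__(self, input_dim, head_dim, n_heads,
                 window_sizes=(2, 3, 4), n_classes=2):
        super().__init__()
        self.window_sizes = window_sizes
        self.cells = nn.ModuleList(
            ODELSTMCell(input_dim, head_dim, n_heads) for _ in window_sizes)
        self.fc = nn.Linear(len(window_sizes) * n_heads * head_dim, n_classes)

    def forward(self, x):  # x: (batch, seq_len, input_dim)
        b, t, _ = x.shape
        feats = []
        for win, cell in zip(self.window_sizes, self.cells):
            windows = []
            for start in range(t - win + 1):  # slide a window of size `win`
                h = x.new_zeros(b, cell.n_heads, cell.head_dim)
                c = torch.zeros_like(h)
                for step in range(win):  # recur only inside the window
                    h, c = cell(x[:, start + step], (h, c))
                windows.append(h.reshape(b, -1))  # final state = n-gram feature
            feats.append(torch.stack(windows, 1).max(dim=1).values)  # pool positions
        return self.fc(torch.cat(feats, dim=-1))
```

A short usage sketch follows; the orthogonal penalty is added to the classification loss with a small weight (the 0.01 coefficient is an arbitrary placeholder, not a value from the thesis).

```python
model = MODELSTM(input_dim=300, head_dim=20, n_heads=4, n_classes=2)
x = torch.randn(8, 30, 300)  # 8 sentences, 30 tokens, 300-d embeddings
logits = model(x)
loss = (F.cross_entropy(logits, torch.randint(0, 2, (8,)))
        + 0.01 * sum(cell.orthogonal_penalty() for cell in model.cells))
loss.backward()
```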
Keywords/Search Tags:Sentence Classification, Recurrent Neural Network, Long Short-term Memory Network, Multi-scale features, Sliding window mechanism