
Algorithms For Recurrent Neural Networks With Long-term Memory

Posted on: 2021-12-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Luo
Full Text: PDF
GTID: 1488306512954249
Subject: Electronic information technology and instrumentation
Abstract/Summary:
While depth brings modeling power to neural networks, it also leads to a series of gradient issues, especially the vanishing gradient problem. Learning to extract long-term dependencies from sequential data is difficult for deep recurrent neural networks (RNNs) because of the gradient issues that arise from depth in both time and space. To overcome these difficulties, this thesis theoretically analyzes the gradient problems RNN models face during training and, based on this analysis, proposes models and algorithms that can efficiently learn long-term dependencies.

First, to tackle the training difficulties caused by the RNN's inherent temporal depth, this thesis combines the basic structure shared by the Long Short-Term Memory (LSTM) and its variants with the idea of partitioning state units into groups, and proposes the Grouped Distributor Unit (GDU). During state transition, the proportion of memory units overwritten in the GDU is limited to a fixed constant for each group, while individual units are allowed different updating paces. The proposed model can therefore latch long-term information flexibly, facilitating the learning of long-term dependencies. Experimental results demonstrate that although the GDU is more compact than LSTM and its variants, it has a stronger long-term memory capability.

Next, this thesis resolves the contradiction between an RNN's spatial depth and its long-term memory ability. The read and write operations used in memory-augmented neural networks (MANNs) are adopted to interconnect multiple recurrent highway transitions, and on this basis the Recurrent Highway Network with Grouped Auxiliary Memory (GAM-RHN) is proposed. This method protects long-term information through a local addressing mechanism. While preserving spatial depth during forward propagation, GAM-RHN provides a shortcut for error signals to propagate back in time, thereby resolving the dilemma between depth in time and depth in space. Experimental results show that making GAM-RHN deeper in space enhances its representational power while retaining its long-term memory.

Finally, the relationship between network sparsity and long-term memory is discussed. This thesis extends the Lottery Ticket Hypothesis (LTH) to tasks involving long-term memory, empirically showing that a randomly initialized, dense RNN contains a subnetwork with good gradient properties. This thesis therefore argues that properly pruning an RNN can facilitate its long-term memory. Furthermore, in LTH-related trials, this thesis found that pruned weights with small magnitudes often experience sign flips during training. Based on this finding, this thesis presents a neural network pruning algorithm that uses the weight sign-flip rate as its criterion. This algorithm is regarded as a regularization technique, and experiments show that it can improve both the generalization and the long-term memory ability of RNNs.
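The exact GDU formulation is given in the thesis itself; the sketch below only illustrates the grouped-update idea described above, assuming a per-group softmax "distributor" so that the total overwrite proportion of each group is fixed while individual units update at different paces. All names, shapes, and initializations here are illustrative assumptions, not the author's implementation.

```python
import numpy as np

class GroupedUpdateCell:
    """Illustrative grouped-update recurrent cell (not the thesis' exact GDU).

    The hidden state is split into `n_groups` groups. Within each group a
    softmax "distributor" decides how strongly each unit is overwritten, so
    the total overwrite proportion per group is a fixed constant (it sums to
    one), and lightly touched units carry their previous values forward.
    """

    def __init__(self, input_size, hidden_size, n_groups, seed=0):
        assert hidden_size % n_groups == 0
        rng = np.random.default_rng(seed)
        self.n_groups = n_groups
        self.group_size = hidden_size // n_groups
        scale = 1.0 / np.sqrt(input_size + hidden_size)
        self.W_z = rng.normal(0, scale, (input_size + hidden_size, hidden_size))  # candidate state
        self.W_d = rng.normal(0, scale, (input_size + hidden_size, hidden_size))  # distributor logits
        self.b_z = np.zeros(hidden_size)
        self.b_d = np.zeros(hidden_size)

    def step(self, x, h_prev):
        xh = np.concatenate([x, h_prev])
        z = np.tanh(xh @ self.W_z + self.b_z)                          # candidate state
        logits = (xh @ self.W_d + self.b_d).reshape(self.n_groups, self.group_size)
        d = np.exp(logits - logits.max(axis=1, keepdims=True))
        d = (d / d.sum(axis=1, keepdims=True)).reshape(-1)             # per-group softmax: weights sum to 1
        return (1.0 - d) * h_prev + d * z                              # convex combination latches memory
```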
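Similarly, the following is only a loose sketch of coupling a recurrent highway transition to an external memory through content-based read and write operations, in the spirit of GAM-RHN; the thesis' grouped memory layout and local addressing mechanism are not reproduced, and every name and parameter below is an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_memory_step(x, h, memory, params):
    """One illustrative step of a highway transition coupled to an external memory.

    `memory` has shape (n_slots, slot_size). Information read from memory joins
    the highway input, and the new state is written back, so error signals can
    shortcut through the memory instead of every deep transition.
    """
    W_r, W_h, W_t, W_w = params                     # read-key, candidate, transform-gate, write projections
    key = np.tanh(h @ W_r)                          # content-based read key
    attn = np.exp(memory @ key)
    attn /= attn.sum()                              # addressing weights over slots
    read = attn @ memory                            # weighted read vector

    inp = np.concatenate([x, h, read])
    cand = np.tanh(inp @ W_h)                       # highway candidate
    t = sigmoid(inp @ W_t)                          # transform gate (carry gate is 1 - t)
    h_new = t * cand + (1.0 - t) * h                # recurrent highway transition

    memory = memory + np.outer(attn, h_new @ W_w)   # additive write to the addressed slots
    return h_new, memory
```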
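The sign-flip pruning criterion is described in the abstract only at a high level; the sketch below is one plausible reading of it, tracking how often each weight's sign flips between training updates and pruning the weights with the highest flip rate. The class name and the exact thresholding rule are assumptions for illustration.

```python
import numpy as np

class SignFlipPruner:
    """Track per-weight sign flips across training updates, then prune the
    highest-flip-rate weights (an illustrative reading of the flip-rate criterion)."""

    def __init__(self, weights):
        self.prev_sign = np.sign(weights)
        self.flips = np.zeros_like(weights, dtype=float)
        self.steps = 0

    def update(self, weights):
        sign = np.sign(weights)
        # Count a flip only when both old and new signs are nonzero and differ.
        self.flips += (sign != self.prev_sign) & (sign != 0) & (self.prev_sign != 0)
        self.prev_sign = sign
        self.steps += 1

    def prune_mask(self, weights, sparsity=0.5):
        rate = self.flips / max(self.steps, 1)
        k = int(sparsity * weights.size)
        threshold = np.sort(rate.ravel())[::-1][k - 1] if k > 0 else np.inf
        return rate < threshold  # keep weights whose flip rate is below the cut (ties may shift sparsity slightly)
```

A typical use would be to call `update` after each optimizer step and, at the end of a training round, apply `prune_mask` element-wise to the weight matrix before retraining the surviving subnetwork.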
Keywords/Search Tags: Deep Learning, Recurrent Neural Networks, Long-Term Dependency, Deep State Transition, Neural Network Pruning