
Algorithms For Recurrent Neural Networks With Long-term Memory

Posted on: 2021-12-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Luo
Full Text: PDF
GTID: 1488306512954249
Subject: Electronic information technology and instrumentation
Abstract/Summary:
While depth brings modeling power to neural networks, it also leads to a series of gradient issues, especially the vanishing gradient problem. Learning to extract long-term dependencies from sequential data is difficult for deep recurrent neural networks (RNNs) because of the gradient issues that arise from depth in both time and space. To overcome these difficulties, this thesis theoretically analyzes the gradient problems RNN models face during training and, based on this analysis, proposes models and algorithms that can efficiently learn long-term dependencies.

First, to tackle the training difficulties caused by the RNN's inherent temporal depth, this thesis combines the basic structure shared by the Long Short-Term Memory (LSTM) and its variants with the idea of partitioning state units into groups, and proposes the Grouped Distributor Unit (GDU). During state transition, the proportion of memory units overwritten in the GDU is limited to a fixed constant for each group, while individual units are allowed different updating paces. The proposed model can therefore latch long-term information flexibly, facilitating the learning of long-term dependencies. Experimental results demonstrate that although the GDU is more compact than LSTM and its variants, it has a stronger long-term memory capability.

Next, this thesis resolves the contradiction between an RNN's spatial depth and its long-term memory ability. The read and write operations used in memory-augmented neural networks (MANNs) are adopted to interconnect multiple recurrent highway transitions, and on this basis the Recurrent Highway Network with Grouped Auxiliary Memory (GAM-RHN) is proposed. This method protects long-term information through a local addressing mechanism. While preserving spatial depth during forward propagation, GAM-RHN provides a shortcut for error signals to propagate back in time, thereby resolving the dilemma between depth in time and depth in space. Experimental results show that making GAM-RHN deeper in space enhances its representational power while retaining its long-term memory.

Finally, the relationship between network sparsity and long-term memory is discussed. This thesis extends the Lottery Ticket Hypothesis (LTH) to tasks involving long-term memory, empirically showing that a randomly initialized, dense RNN contains a subnetwork with good gradient properties. This thesis therefore argues that properly pruning an RNN can facilitate its long-term memory. Furthermore, in LTH-related trials, this thesis found that pruned weights with small magnitudes often experience sign flips during training. Based on this finding, this thesis presents a neural network pruning algorithm that uses the weight sign-flip rate as its criterion. This algorithm is regarded as a regularization technique, and experiments show that it can improve both the generalization and the long-term memory ability of RNNs.
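The exact GDU formulation is given in the thesis itself; the sketch below only illustrates the grouped-update idea described above, assuming a per-group softmax "distributor" so that the total overwrite proportion of each group is fixed while individual units update at different paces. All names, shapes, and initializations here are illustrative assumptions, not the author's implementation.

```python
import numpy as np

class GroupedUpdateCell:
    """Illustrative grouped-update recurrent cell (not the thesis' exact GDU).

    The hidden state is split into `n_groups` groups. Within each group a
    softmax "distributor" decides how strongly each unit is overwritten, so
    the total overwrite proportion per group is a fixed constant (it sums to
    one), and lightly touched units carry their previous values forward.
    """

    def __init__(self, input_size, hidden_size, n_groups, seed=0):
        assert hidden_size % n_groups == 0
        rng = np.random.default_rng(seed)
        self.n_groups = n_groups
        self.group_size = hidden_size // n_groups
        scale = 1.0 / np.sqrt(input_size + hidden_size)
        self.W_z = rng.normal(0, scale, (input_size + hidden_size, hidden_size))  # candidate state
        self.W_d = rng.normal(0, scale, (input_size + hidden_size, hidden_size))  # distributor logits
        self.b_z = np.zeros(hidden_size)
        self.b_d = np.zeros(hidden_size)

    def step(self, x, h_prev):
        xh = np.concatenate([x, h_prev])
        z = np.tanh(xh @ self.W_z + self.b_z)                          # candidate state
        logits = (xh @ self.W_d + self.b_d).reshape(self.n_groups, self.group_size)
        d = np.exp(logits - logits.max(axis=1, keepdims=True))
        d = (d / d.sum(axis=1, keepdims=True)).reshape(-1)             # per-group softmax: weights sum to 1
        return (1.0 - d) * h_prev + d * z                              # convex combination latches memory
```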
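Similarly, the following is only a loose sketch of coupling a recurrent highway transition to an external memory through content-based read and write operations, in the spirit of GAM-RHN; the thesis' grouped memory layout and local addressing mechanism are not reproduced, and every name and parameter below is an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_memory_step(x, h, memory, params):
    """One illustrative step of a highway transition coupled to an external memory.

    `memory` has shape (n_slots, slot_size). Information read from memory joins
    the highway input, and the new state is written back, so error signals can
    shortcut through the memory instead of every deep transition.
    """
    W_r, W_h, W_t, W_w = params                     # read-key, candidate, transform-gate, write projections
    key = np.tanh(h @ W_r)                          # content-based read key
    attn = np.exp(memory @ key)
    attn /= attn.sum()                              # addressing weights over slots
    read = attn @ memory                            # weighted read vector

    inp = np.concatenate([x, h, read])
    cand = np.tanh(inp @ W_h)                       # highway candidate
    t = sigmoid(inp @ W_t)                          # transform gate (carry gate is 1 - t)
    h_new = t * cand + (1.0 - t) * h                # recurrent highway transition

    memory = memory + np.outer(attn, h_new @ W_w)   # additive write to the addressed slots
    return h_new, memory
```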
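The sign-flip pruning criterion is described in the abstract only at a high level; the sketch below is one plausible reading of it, tracking how often each weight's sign flips between training updates and pruning the weights with the highest flip rate. The class name and the exact thresholding rule are assumptions for illustration.

```python
import numpy as np

class SignFlipPruner:
    """Track per-weight sign flips across training updates, then prune the
    highest-flip-rate weights (an illustrative reading of the flip-rate criterion)."""

    def __init__(self, weights):
        self.prev_sign = np.sign(weights)
        self.flips = np.zeros_like(weights, dtype=float)
        self.steps = 0

    def update(self, weights):
        sign = np.sign(weights)
        # Count a flip only when both old and new signs are nonzero and differ.
        self.flips += (sign != self.prev_sign) & (sign != 0) & (self.prev_sign != 0)
        self.prev_sign = sign
        self.steps += 1

    def prune_mask(self, weights, sparsity=0.5):
        rate = self.flips / max(self.steps, 1)
        k = int(sparsity * weights.size)
        threshold = np.sort(rate.ravel())[::-1][k - 1] if k > 0 else np.inf
        return rate < threshold  # keep weights whose flip rate is below the cut (ties may shift sparsity slightly)
```

A typical use would be to call `update` after each optimizer step and, at the end of a training round, apply `prune_mask` element-wise to the weight matrix before retraining the surviving subnetwork.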
Keywords/Search Tags: Deep Learning, Recurrent Neural Networks, Long-Term Dependency, Deep State Transition, Neural Network Pruning