
Research On Improving Generalization Ability Based On LSTM Network

Posted on: 2022-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Chen
Full Text: PDF
GTID: 2518306737954139
Subject: IC Engineering
Abstract/Summary:
In recent years, neural network technology has developed rapidly and is gradually being applied in smart products. In many cases, a neural network must actually be deployed in order to deliver its value, yet real-world usage scenarios are complex and changeable, which poses no small challenge. To meet such demands, the generalization ability of neural networks must be strong enough to adapt to real application scenarios. Because of its practical significance, improving the generalization ability of neural networks has become a concern of many researchers.

This paper studies the generalization ability of neural networks from the perspectives of both model and data. To improve generalization, it proposes a method based on multi-head attention and a method of multi-domain data augmentation. The main research contents are as follows:

(1) A method to improve the generalization of neural networks based on multi-head attention is proposed. The method first selects, from multiple parallel LSTM network structures, the LSTMs most related to the input task according to the multi-head attention mechanism, and then uses a mask matrix to selectively activate them according to the attention scores. An activated LSTM can read the information of the other LSTMs and complete the information exchange. In this process, task-related information is retained and universal features of the task are extracted, giving the neural network stronger generalization performance. In an experiment against traditional parallel LSTMs, the average test error of this method on four data sets is about 1.39% lower than that of the traditional method. In comparison with related studies, the average test error on the four data sets is about 0.21% lower than that of the sub-optimal algorithm, and the average test error under noisy conditions is also about 0.73% lower
than that of the sub-optimal algorithm. Theoretical analysis and experiments show that this method can effectively improve the generalization ability of neural networks.

(2) Among the data augmentation methods used in speech recognition, most are conventional ones that operate in the time domain, yet the frequency-domain structure of speech data is equally important. This paper therefore proposes a multi-domain data augmentation method. First, four time-domain augmentation methods are applied to the data set, and then three frequency-domain methods, including spectral noise and spectral masking, are performed. This method changes the structure and distribution of the data, which helps the neural network learn general characteristics and thereby improves its generalization ability. Experiments show that the generalization error of this method is about 2.76% lower than the average of the other methods, a significant improvement in the generalization ability of the neural network.
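As a rough illustration of the branch-selection idea in method (1), the sketch below (plain NumPy, not the thesis code; random vectors stand in for real LSTM hidden states, and all function and variable names are hypothetical) scores parallel branches against a task query, builds a binary mask that activates only the top-scoring branches, and returns the attention-weighted mixture of the surviving branches:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_branch_mixture(branch_outputs, keys, query, top_k=2):
    """Score each parallel branch (stand-ins for LSTM outputs) against the
    task query, keep only the top_k branches via a binary mask, and return
    the attention-weighted mixture of the surviving branches."""
    scores = softmax(keys @ query / np.sqrt(len(query)))   # (n_branches,)
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-top_k:]] = 1.0                # activate top_k only
    gated = scores * mask
    gated = gated / gated.sum()                            # renormalise weights
    return gated @ branch_outputs, mask

rng = np.random.default_rng(0)
branch_outputs = rng.normal(size=(4, 8))  # 4 branches, 8-dim features each
keys = rng.normal(size=(4, 8))            # one key vector per branch
query = rng.normal(size=8)                # task-derived query vector
mixed, mask = masked_branch_mixture(branch_outputs, keys, query, top_k=2)
```

In a full implementation, `branch_outputs` would be the hidden states of the parallel LSTMs, and the multi-head variant would repeat this scoring with several independent query/key projections; the binary mask plays the role of the thesis's mask matrix, letting only task-relevant branches exchange information.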
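The time- and frequency-domain augmentations described in (2) can be sketched as follows. This is a minimal NumPy illustration, not the thesis pipeline: a time shift and additive noise serve as example time-domain methods, and zeroing a band of frequency bins stands in for spectral masking; all names and parameter choices are hypothetical.

```python
import numpy as np

def time_shift(x, shift):
    """Time-domain augmentation: circularly shift the waveform."""
    return np.roll(x, shift)

def add_noise(x, snr_db, rng):
    """Time-domain augmentation: add white noise at a target SNR (in dB)."""
    noise = rng.normal(size=x.shape)
    scale = np.sqrt((x**2).mean() / (10**(snr_db / 10) * (noise**2).mean()))
    return x + scale * noise

def freq_mask(x, start, width):
    """Frequency-domain augmentation: zero a band of frequency bins
    (spectral masking), then transform back to the time domain."""
    spec = np.fft.rfft(x)
    spec[start:start + width] = 0
    return np.fft.irfft(spec, n=len(x))

rng = np.random.default_rng(0)
wave = np.sin(2 * np.pi * 5 * np.arange(256) / 256)  # toy 1-D "speech" signal
augmented = freq_mask(add_noise(time_shift(wave, 10), snr_db=20, rng=rng),
                      start=3, width=4)
```

Chaining the transforms this way mirrors the thesis's recipe of applying the time-domain methods first and the frequency-domain methods afterwards; each transform perturbs a different aspect of the data distribution, which is what encourages the network to learn features that generalize.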
Keywords/Search Tags:Neural network, Generalization, Multi-head attention, Data augmentation