
Research On Improving Generalization Ability Based On LSTM Network

Posted on: 2022-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Chen
Full Text: PDF
GTID: 2518306737954139
Subject: IC Engineering
Abstract/Summary:
In recent years, neural network technology has developed rapidly and is gradually being applied in smart products. In many cases, a neural network must actually be deployed in order to deliver its value, yet real-world usage scenarios are complex and changeable, which poses no small challenge. To meet such demands, the generalization ability of neural networks must be strong enough to adapt to real application scenarios. Because of its practical significance, improving the generalization ability of neural networks has become a concern of many researchers.

This paper studies the generalization ability of neural networks from the perspectives of both model and data. To improve generalization, it proposes a method based on multi-head attention and a method of multi-domain data augmentation. The main research contents are as follows:

(1) A method to improve the generalization of neural networks based on multi-head attention is proposed. The method first selects, from multiple parallel LSTM network structures, the LSTMs most related to the input task according to the multi-head attention mechanism, and then uses a mask matrix to selectively activate them according to the attention scores. An activated LSTM can read the information of the other LSTMs and complete the information exchange. In this process, task-related information is retained and universal features of the task are extracted, giving the neural network stronger generalization performance. In an experiment against traditional parallel LSTMs, the average test error of this method on four data sets is about 1.39% lower than that of the traditional method. In comparison with related studies, the average test error on the four data sets is about 0.21% lower than that of the sub-optimal algorithm, and the average test error under noisy conditions is also about 0.73% lower
than that of the sub-optimal algorithm. Theoretical analysis and experiments show that this method can effectively improve the generalization ability of neural networks.

(2) Among the data augmentation methods used in speech recognition, most are conventional ones that operate in the time domain, yet the frequency-domain structure of speech data is equally important. This paper therefore proposes a multi-domain data augmentation method. First, four time-domain augmentation methods are applied to the data set, and then three frequency-domain methods, including spectral noise and spectral masking, are performed. This method changes the structure and distribution of the data, which helps the neural network learn general characteristics and thereby improves its generalization ability. Experiments show that the generalization error of this method is about 2.76% lower than the average of the other methods, a significant improvement in the generalization ability of the neural network.
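As a rough illustration of the branch-selection idea in method (1), the sketch below (plain NumPy, not the thesis code; random vectors stand in for real LSTM hidden states, and all function and variable names are hypothetical) scores parallel branches against a task query, builds a binary mask that activates only the top-scoring branches, and returns the attention-weighted mixture of the surviving branches:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_branch_mixture(branch_outputs, keys, query, top_k=2):
    """Score each parallel branch (stand-ins for LSTM outputs) against the
    task query, keep only the top_k branches via a binary mask, and return
    the attention-weighted mixture of the surviving branches."""
    scores = softmax(keys @ query / np.sqrt(len(query)))   # (n_branches,)
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-top_k:]] = 1.0                # activate top_k only
    gated = scores * mask
    gated = gated / gated.sum()                            # renormalise weights
    return gated @ branch_outputs, mask

rng = np.random.default_rng(0)
branch_outputs = rng.normal(size=(4, 8))  # 4 branches, 8-dim features each
keys = rng.normal(size=(4, 8))            # one key vector per branch
query = rng.normal(size=8)                # task-derived query vector
mixed, mask = masked_branch_mixture(branch_outputs, keys, query, top_k=2)
```

In a full implementation, `branch_outputs` would be the hidden states of the parallel LSTMs, and the multi-head variant would repeat this scoring with several independent query/key projections; the binary mask plays the role of the thesis's mask matrix, letting only task-relevant branches exchange information.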
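The time- and frequency-domain augmentations described in (2) can be sketched as follows. This is a minimal NumPy illustration, not the thesis pipeline: a time shift and additive noise serve as example time-domain methods, and zeroing a band of frequency bins stands in for spectral masking; all names and parameter choices are hypothetical.

```python
import numpy as np

def time_shift(x, shift):
    """Time-domain augmentation: circularly shift the waveform."""
    return np.roll(x, shift)

def add_noise(x, snr_db, rng):
    """Time-domain augmentation: add white noise at a target SNR (in dB)."""
    noise = rng.normal(size=x.shape)
    scale = np.sqrt((x**2).mean() / (10**(snr_db / 10) * (noise**2).mean()))
    return x + scale * noise

def freq_mask(x, start, width):
    """Frequency-domain augmentation: zero a band of frequency bins
    (spectral masking), then transform back to the time domain."""
    spec = np.fft.rfft(x)
    spec[start:start + width] = 0
    return np.fft.irfft(spec, n=len(x))

rng = np.random.default_rng(0)
wave = np.sin(2 * np.pi * 5 * np.arange(256) / 256)  # toy 1-D "speech" signal
augmented = freq_mask(add_noise(time_shift(wave, 10), snr_db=20, rng=rng),
                      start=3, width=4)
```

Chaining the transforms this way mirrors the thesis's recipe of applying the time-domain methods first and the frequency-domain methods afterwards; each transform perturbs a different aspect of the data distribution, which is what encourages the network to learn features that generalize.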
Keywords/Search Tags:Neural network, Generalization, Multi-head attention, Data augmentation