
Minimal Gated Unit For Recurrent Neural Networks

Posted on: 2017-03-20    Degree: Master    Type: Thesis
Country: China    Candidate: G B Zhou    Full Text: PDF
GTID: 2308330485966385    Subject: Computer technology
Abstract/Summary:
Countless learning tasks require dealing with sequential data. Some problems require a model to produce outputs that are sequences; in other domains, a model must learn from inputs that are sequences; interactive tasks often demand both capabilities. Compared to traditional models, recurrent neural networks are better suited to all three categories of problems. In practice, recurrent neural networks have been successfully applied in many areas, such as speech recognition, video motion analysis, handwriting recognition, and image captioning, achieving good results. After years of development, the recurrent neural network has spawned many variants; LSTM and GRU are the most widely used structures, and both are gated units. Benefiting from evaluation results on LSTM and GRU in the literature, we propose a gated unit for RNNs, named the Minimal Gated Unit (MGU). It contains only one gate, which is a minimal design among all gated hidden units. Compared to previous variants, MGU's contributions are as follows.

First, MGU minimizes the number of gates in the structure, so it has far fewer parameters than LSTM and GRU. Its training complexity and training speed also benefit from this property. In some of our experiments, MGU is much faster than GRU: MGU reaches a good result within an acceptable time cost, while GRU does not.

Second, its simple architecture means that it is easier to evaluate and tune, and in principle it is easier to study MGU's properties theoretically and empirically.

Third, we evaluated the effectiveness of MGU on four problems (the adding problem, sentiment analysis, image classification, and language modeling), and MGU achieved accuracy comparable to GRU for input sequence lengths that are short (35, 50-55), moderate (128), and long (784).
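The abstract does not spell out MGU's update equations, so the following is a minimal NumPy sketch of one MGU step, assuming the coupled single-gate formulation the abstract describes: one forget gate both masks the previous state inside the candidate activation and interpolates the new hidden state. All parameter names (W_f, U_f, b_f, W_h, U_h, b_h) are illustrative, not taken from the thesis.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mgu_step(x_t, h_prev, W_f, U_f, b_f, W_h, U_h, b_h):
    """One step of a Minimal Gated Unit (sketch).

    The single forget gate f_t plays both roles that GRU splits
    between its update and reset gates.
    """
    # Forget gate: the only gate in the unit.
    f_t = sigmoid(W_f @ x_t + U_f @ h_prev + b_f)
    # Candidate state: the previous state is masked by the same gate.
    h_tilde = np.tanh(W_h @ x_t + U_h @ (f_t * h_prev) + b_h)
    # Interpolate between the old state and the candidate, again with f_t.
    return (1.0 - f_t) * h_prev + f_t * h_tilde

# Example: input size 3, hidden size 4, random parameters.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W_f = rng.standard_normal((n_hid, n_in))
W_h = rng.standard_normal((n_hid, n_in))
U_f = rng.standard_normal((n_hid, n_hid))
U_h = rng.standard_normal((n_hid, n_hid))
b_f = np.zeros(n_hid)
b_h = np.zeros(n_hid)

h = np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):  # a length-5 input sequence
    h = mgu_step(x, h, W_f, U_f, b_f, W_h, U_h, b_h)
print(h)

With only two input-to-hidden and two hidden-to-hidden matrices, such a unit uses roughly two thirds of GRU's recurrent parameters and half of LSTM's, which is consistent with the abstract's claim that MGU is smaller and faster to train.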
Keywords/Search Tags: RNN, Machine Learning, LSTM, GRU