
Research and Application of Tensorized Language Models

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: X D Ma
Full Text: PDF
GTID: 2518306548485834
Subject: Computer technology
Abstract/Summary:
The development of language models is important for natural language processing, and deep neural networks have provided a powerful impetus for language modeling. At present, various neural-network-based language models exist; however, the research on neural language models still faces problems that need to be studied and solved. This thesis mainly studies two questions: (1) an interpretability mechanism for neural language models based on separation rank; (2) neural language model compression based on tensor low-rank decomposition.

To study the interpretability of neural language models, the separation rank of the language modeling process is defined and used to measure the degree of contextual dependency within a sentence. A lower bound on the separation rank then reveals the quantitative relation between the network structure and its ability to model contextual dependencies. Based on this analysis, an adaptive recurrent network that uses the separation rank to model contextual dependency is proposed. The theoretical analysis is verified on various NLP tasks; on sentence classification, the experimental results show that the adaptive recurrent network achieves better results than a traditional bidirectional LSTM.

Facing the difficulty of training large pre-trained language models (GPT-1, GPT-2, BERT, etc.), this thesis combines tensor low-rank decomposition with the idea of parameter sharing to compress the popular Transformer language model, achieving 8x parameter compression in the multi-head attention layer. The model is tested on three language modeling datasets and achieves results comparable to the best models with half the parameters of the original Transformer. On an English-German machine translation task, the method halves the parameter count while preserving the model's translation quality.
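For context, a brief note on the two techniques summarized above may help readers.

Separation rank is a standard measure of dependence between two groups of variables, used in expressiveness analyses of deep networks. Assuming the thesis follows the standard convention (the abstract does not spell its exact definition out), the quantity reads:

\mathrm{sep}_{(A,B)}(f) = \min\Big\{ R \in \mathbb{N} \;:\; f(\mathbf{x}_A, \mathbf{x}_B) = \sum_{r=1}^{R} g_r(\mathbf{x}_A)\, h_r(\mathbf{x}_B) \Big\}

A function with no interaction between the two variable groups has separation rank 1, while strong contextual dependency between the two parts of a sentence forces a large R; this is why a lower bound on the separation rank quantifies a network's ability to model long-range context.

The attention-compression idea can likewise be illustrated with a minimal sketch. The PyTorch class below replaces the dense Q/K/V projections of multi-head attention with a shared low-rank bottleneck; the class name LowRankSharedAttention, the rank parameter, and the single shared factor U are hypothetical illustrations of low-rank factorization combined with parameter sharing, not the thesis's actual tensor decomposition.

import torch
import torch.nn as nn

class LowRankSharedAttention(nn.Module):
    """Illustrative sketch: Q/K/V projections factorized through one
    shared low-rank map. Not the thesis's exact architecture."""

    def __init__(self, d_model: int, n_heads: int, rank: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Shared "down" factor (d_model -> rank), reused by Q, K and V;
        # three small "up" factors (rank -> d_model) restore the width.
        self.U = nn.Linear(d_model, rank, bias=False)
        self.V_q = nn.Linear(rank, d_model, bias=False)
        self.V_k = nn.Linear(rank, d_model, bias=False)
        self.V_v = nn.Linear(rank, d_model, bias=False)
        self.out = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        h = self.U(x)  # shared low-rank bottleneck: (B, T, rank)
        q = self.V_q(h).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = self.V_k(h).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        v = self.V_v(h).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        att = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        y = (att @ v).transpose(1, 2).reshape(B, T, D)
        return self.out(y)

With d_model = 512, n_heads = 8 and rank = 64, the factorized Q/K/V path uses 4 * 512 * 64 = 131,072 weights against 3 * 512^2 = 786,432 for dense projections, a 6x saving on those matrices alone; the 8x figure reported above presumably comes from the thesis's specific decomposition and sharing scheme, which this sketch does not reproduce.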
Keywords/Search Tags:Language Model, Separation Rank, Tensor Decomposition, Tensor Space, Neural Network