
A Study On Softmax Layer Of Neural Language Model

Posted on: 2021-01-15    Degree: Master    Type: Thesis
Country: China    Candidate: Y C Zhang    Full Text: PDF
GTID: 2428330626966117    Subject: Engineering
Abstract/Summary:
The neural language model (NLM) addresses a basic task in Natural Language Processing (NLP): modeling natural language sequences with distributed word representations, thereby overcoming the curse of dimensionality that afflicts statistical language models. Neural language models are widely used in tasks such as information retrieval and question answering; for machine translation (MT) and text generation they are even a key component of the overall model. As the output layer of the neural language model, the softmax layer determines the performance of the model, so studying it can both improve the language model itself and provide valuable information for downstream tasks.

Traditional neural language models consider only the similarity between the softmax layer's predicted distribution and the target-word distribution, and therefore train with a single loss: the cross entropy between the prediction of the softmax layer and the target word. Word sequences, however, have an inherent property: given a sentence, the probability that two words drawn from different positions are identical is very small. To exploit this property explicitly, this thesis constructs a cross-entropy loss based on the softmax layer's prediction distribution over the context words and uses it as an additional training constraint. Experimental results show that this method effectively reduces the perplexity (PPL) of the neural language model.

For neural MT, we compute the prediction accuracy of the softmax layer and find that, when ranking the quality of machine translation models, it correlates strongly with the mainstream evaluation metric, BLEU. This correlation shows that the accuracy can evaluate the quality of neural machine translation models well, and the metric can provide useful information for research on machine translation evaluation.

Further, this thesis examines the diversity of machine translation. The factors behind this diversity mainly include the insertion and omission of words, the substitution of synonyms, and differences in sentence structure. Relying on the accuracy and average probability of the softmax layer's predictions at each position, the thesis verifies the significance of translation diversity at the data level. We also find that the softmax layer's prediction accuracy does not increase with the length of the input to the decoder.
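The additional constraint described above can be illustrated with a toy example. The abstract does not give the exact formulation, so the sketch below makes an assumption: since a word rarely occurs twice in one sentence, the auxiliary term penalizes probability mass that the softmax layer places on words already present in the context; the weight `lam` and the penalty form are illustrative, not the thesis's actual definitions.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, target_idx):
    # Negative log-probability of the target word.
    return -math.log(probs[target_idx])

def combined_loss(logits, target_idx, context_idxs, lam=0.1):
    # Target cross entropy plus a context-based auxiliary term.
    # Assumption: the auxiliary term discourages predicting words that
    # already appear in the context (words rarely repeat in a sentence).
    probs = softmax(logits)
    main = cross_entropy(probs, target_idx)
    aux = sum(-math.log(1.0 - probs[i]) for i in context_idxs) / len(context_idxs)
    return main + lam * aux
```

Under this reading, the total loss is always at least the standard target cross entropy, and it grows whenever the model assigns high probability to a word that has already occurred in the context.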
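The softmax prediction accuracy used to rank MT systems can be sketched as follows. This is a minimal, assumed reading of the metric: the share of target positions at which the softmax layer's top-1 prediction (the argmax of the logits) matches the reference token; the abstract reports that scores of this kind correlate strongly with BLEU across models.

```python
def prediction_accuracy(logit_rows, reference_ids):
    # Fraction of positions where the softmax layer's argmax
    # equals the reference token id.
    correct = 0
    for logits, ref in zip(logit_rows, reference_ids):
        pred = max(range(len(logits)), key=logits.__getitem__)
        correct += (pred == ref)
    return correct / len(reference_ids)

# Hypothetical decoder outputs over a 2-word vocabulary at 3 positions:
rows = [[0.1, 2.0], [1.5, 0.3], [0.2, 0.9]]
refs = [1, 0, 0]   # predictions are [1, 0, 1], so accuracy is 2/3
```

Because the argmax of the softmax equals the argmax of the logits, the metric can be computed without normalizing, which keeps evaluation cheap at decoding time.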
Keywords/Search Tags:neural language model, softmax, context's diversity, machine translation evaluation, the diversity of MT