
A Study On Softmax Layer Of Neural Language Model

Posted on: 2021-01-15    Degree: Master    Type: Thesis
Country: China    Candidate: Y C Zhang    Full Text: PDF
GTID: 2428330626966117    Subject: Engineering
Abstract/Summary:
The neural language model (NLM) addresses a basic task in Natural Language Processing (NLP): modeling natural language sequences with distributed word representations, thereby overcoming the curse of dimensionality that afflicts statistical language models. Neural language models are widely used in tasks such as information retrieval and question answering; for machine translation (MT) and text generation they are even a key component of the overall model. As the output layer of the neural language model, the softmax layer determines the performance of the model, so studying it can both improve the language model itself and provide valuable information for downstream tasks.

Traditional neural language models consider only the similarity between the softmax layer's predicted distribution and the target-word distribution, and therefore train with a single loss: the cross entropy between the prediction of the softmax layer and the target word. Word sequences, however, have an inherent property: given a sentence, the probability that two words drawn from different positions are identical is very small. To exploit this property explicitly, this thesis constructs a cross-entropy loss based on the softmax layer's prediction distribution over the context words and uses it as an additional training constraint. Experimental results show that this method effectively reduces the perplexity (PPL) of the neural language model.

For neural MT, we compute the prediction accuracy of the softmax layer and find that, when ranking the quality of machine translation models, it correlates strongly with the mainstream evaluation metric, BLEU. This correlation shows that the accuracy can evaluate the quality of neural machine translation models well, and the metric can provide useful information for research on machine translation evaluation.

Further, this thesis examines the diversity of machine translation. The factors behind this diversity mainly include the insertion and omission of words, the substitution of synonyms, and differences in sentence structure. Relying on the accuracy and average probability of the softmax layer's predictions at each position, the thesis verifies the significance of translation diversity at the data level. We also find that the softmax layer's prediction accuracy does not increase with the length of the input to the decoder.
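The additional constraint described above can be illustrated with a toy example. The abstract does not give the exact formulation, so the sketch below makes an assumption: since a word rarely occurs twice in one sentence, the auxiliary term penalizes probability mass that the softmax layer places on words already present in the context; the weight `lam` and the penalty form are illustrative, not the thesis's actual definitions.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, target_idx):
    # Negative log-probability of the target word.
    return -math.log(probs[target_idx])

def combined_loss(logits, target_idx, context_idxs, lam=0.1):
    # Target cross entropy plus a context-based auxiliary term.
    # Assumption: the auxiliary term discourages predicting words that
    # already appear in the context (words rarely repeat in a sentence).
    probs = softmax(logits)
    main = cross_entropy(probs, target_idx)
    aux = sum(-math.log(1.0 - probs[i]) for i in context_idxs) / len(context_idxs)
    return main + lam * aux
```

Under this reading, the total loss is always at least the standard target cross entropy, and it grows whenever the model assigns high probability to a word that has already occurred in the context.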
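The softmax prediction accuracy used to rank MT systems can be sketched as follows. This is a minimal, assumed reading of the metric: the share of target positions at which the softmax layer's top-1 prediction (the argmax of the logits) matches the reference token; the abstract reports that scores of this kind correlate strongly with BLEU across models.

```python
def prediction_accuracy(logit_rows, reference_ids):
    # Fraction of positions where the softmax layer's argmax
    # equals the reference token id.
    correct = 0
    for logits, ref in zip(logit_rows, reference_ids):
        pred = max(range(len(logits)), key=logits.__getitem__)
        correct += (pred == ref)
    return correct / len(reference_ids)

# Hypothetical decoder outputs over a 2-word vocabulary at 3 positions:
rows = [[0.1, 2.0], [1.5, 0.3], [0.2, 0.9]]
refs = [1, 0, 0]   # predictions are [1, 0, 1], so accuracy is 2/3
```

Because the argmax of the softmax equals the argmax of the logits, the metric can be computed without normalizing, which keeps evaluation cheap at decoding time.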
Keywords/Search Tags:neural language model, softmax, context's diversity, machine translation evaluation, the diversity of MT