
Research On Semantic Reinforcement Based On Topic And Word Features For RNN Language Model

Posted on: 2020-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: W Li
Full Text: PDF
GTID: 2428330590963052
Subject: Software engineering
Abstract/Summary:
The language model is one of the basic tasks in the field of natural language processing. It also underlies many text processing tasks such as speech recognition, dialogue systems, machine translation, and syntactic analysis. With the development of neural networks, language models have gradually evolved from the traditional N-gram model to models based on recurrent neural networks. In a recurrent neural network language model, all inputs share the same recurrent structure, so historical information in the sequence can be recorded in the network. However, this gives rise to the long-distance dependency problem: long-distance semantic information is repeatedly updated and overwritten in the network, so important global semantic information is gradually forgotten.

This paper focuses on the weak semantic expression capability of the RNN language model and performs semantic reinforcement from the perspectives of topic features and word features. The main work includes the following aspects:

(1) A semantic reinforcement method, TE-RNN, based on a topic representation vector is proposed. The LDA topic model and a pre-trained word vector model are combined with the position information of words to optimize the calculation of the topic representation vector. This provides an accurate topic-level semantic representation for each word in the input sequence.

(2) A semantic reinforcement method, TA-RNN, based on a topic attention distribution is proposed. Unlike TE-RNN, which directly concatenates the topic representation vector into the network input, TA-RNN dynamically assigns a local topic and a global topic to each word by constructing topic attention at the sentence level and the paragraph level. This provides an accurate topic semantic assignment for each word in the input sequence.

(3) A semantic reinforcement method, WD-RNN, based on a term weight feature is proposed. Since stop words account for the majority of words in real corpora (about 70%), the network is flooded with semantically poor word information. This model uses the weight feature of words to construct a weighted dropout layer: it suppresses the input of low-weight word information at the input stage and enhances the contribution of high-weight words at the output stage, so that the information of high-weight words can be better remembered and transmitted through the network.
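As an illustration of the weighted-dropout idea behind WD-RNN, the sketch below shows one plausible input-stage mechanism: each token receives a drop probability that grows as its term weight shrinks, so stop words are suppressed more often than content words. The function name, the linear weight-to-probability mapping, and the `base_rate` parameter are assumptions for illustration, not the thesis's actual formulation.

```python
import numpy as np

def weighted_dropout_mask(term_weights, base_rate=0.5, rng=None):
    """Hypothetical sketch of a weighted dropout layer: tokens with
    lower term weights (e.g. stop words) are dropped with higher
    probability at the input stage, so higher-weight content words
    are more likely to survive and be remembered by the RNN."""
    rng = rng or np.random.default_rng(0)
    w = np.asarray(term_weights, dtype=float)
    w = w / w.max()                      # normalize weights to [0, 1]
    drop_prob = base_rate * (1.0 - w)    # low weight -> high drop probability
    keep = rng.random(w.shape) >= drop_prob
    return keep.astype(float)            # 0/1 mask over the token positions

# Example: TF-IDF-like weights for a 5-token sentence; position 0 is a stop word.
weights = [0.05, 0.9, 0.1, 0.8, 0.7]
mask = weighted_dropout_mask(weights)
# The mask would be multiplied into the token embeddings before the RNN input.
```

The token with the maximum weight gets a drop probability of zero under this mapping, so the most informative word in the sentence is always kept.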
Keywords/Search Tags:RNN language model, topic model, semantic reinforcement, attention, term weight