
Research On Semantic Reinforcement Based On Topic And Word Features For RNN Language Model

Posted on: 2020-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: W Li
Full Text: PDF
GTID: 2428330590963052
Subject: Software engineering
Abstract/Summary:
The language model is one of the basic tasks in the field of natural language processing. It also underlies many text processing tasks such as speech recognition, dialogue systems, machine translation, and syntactic analysis. With the development of neural networks, language models have gradually evolved from the traditional N-gram model to models based on recurrent neural networks. In a recurrent neural network language model, all inputs share the same recurrent structure, so historical information in the sequence can be recorded in the network. However, this gives rise to the long-distance dependency problem: long-distance semantic information is repeatedly updated and overwritten in the network, so important global semantic information is gradually forgotten.

This paper focuses on the weak semantic expression capability of the RNN language model and performs semantic reinforcement from the perspectives of topic features and word features. The main work includes the following aspects:

(1) A semantic reinforcement method, TE-RNN, based on a topic representation vector is proposed. The LDA topic model and a pre-trained word vector model are combined with the position information of words to optimize the calculation of the topic representation vector. This provides an accurate topic-level semantic representation for each word in the input sequence.

(2) A semantic reinforcement method, TA-RNN, based on a topic attention distribution is proposed. Unlike TE-RNN, which directly concatenates the topic representation vector into the network input, TA-RNN dynamically assigns a local topic and a global topic to each word by constructing topic attention at the sentence level and the paragraph level. This provides an accurate topic semantic assignment for each word in the input sequence.

(3) A semantic reinforcement method, WD-RNN, based on a term weight feature is proposed. Since stop words account for the majority of words in real corpora (about 70%), the network is flooded with semantically poor word information. This model uses the weight feature of words to construct a weighted dropout layer: it suppresses the input of low-weight word information at the input stage and enhances the contribution of high-weight words at the output stage, so that the information of high-weight words can be better remembered and transmitted through the network.
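As an illustration of the weighted-dropout idea behind WD-RNN, the sketch below shows one plausible input-stage mechanism: each token receives a drop probability that grows as its term weight shrinks, so stop words are suppressed more often than content words. The function name, the linear weight-to-probability mapping, and the `base_rate` parameter are assumptions for illustration, not the thesis's actual formulation.

```python
import numpy as np

def weighted_dropout_mask(term_weights, base_rate=0.5, rng=None):
    """Hypothetical sketch of a weighted dropout layer: tokens with
    lower term weights (e.g. stop words) are dropped with higher
    probability at the input stage, so higher-weight content words
    are more likely to survive and be remembered by the RNN."""
    rng = rng or np.random.default_rng(0)
    w = np.asarray(term_weights, dtype=float)
    w = w / w.max()                      # normalize weights to [0, 1]
    drop_prob = base_rate * (1.0 - w)    # low weight -> high drop probability
    keep = rng.random(w.shape) >= drop_prob
    return keep.astype(float)            # 0/1 mask over the token positions

# Example: TF-IDF-like weights for a 5-token sentence; position 0 is a stop word.
weights = [0.05, 0.9, 0.1, 0.8, 0.7]
mask = weighted_dropout_mask(weights)
# The mask would be multiplied into the token embeddings before the RNN input.
```

The token with the maximum weight gets a drop probability of zero under this mapping, so the most informative word in the sentence is always kept.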
Keywords/Search Tags:RNN language model, topic model, semantic reinforcement, attention, term weight