
Research On Deep Keyword Generation Method Integrating Auxiliary Information

Posted on: 2022-06-27; Degree: Master; Type: Thesis
Country: China; Candidate: H X Zhu; Full Text: PDF
GTID: 2568306488979829; Subject: Air transportation big data project
Abstract/Summary:
Keyphrases are concise expressions of a document's subject matter, and they let a reader grasp the topic of a text quickly. They are widely used in text topic mining, text classification, document retrieval, and other natural language processing tasks. Existing keyphrase generation methods often focus on integrating the deep semantic information of the document but do not make full use of the rich auxiliary information it contains, such as its title and its organizational structure. Therefore, building on the sequence-to-sequence keyphrase generation model, this thesis uses the auxiliary information of the document to add additional constraints to the model. The work comprises two methods: a multi-task keyphrase generation method that integrates title information, and a keyphrase generation method that integrates sentence structure information.

First, because existing keyphrase generation models often fail to make full use of the relationship between the title and the keyphrases, a multi-task keyphrase generation method that integrates title information is proposed. Specifically, keyphrase generation is the main task and title generation is the auxiliary task; an agreement-based loss is added to the objective function to strengthen the constraints on the attention mechanisms of the two tasks, so that the key information in the title is exploited. Experimental results show that this model outperforms the comparison models in both present keyphrase prediction and absent keyphrase prediction.

Second, because scientific and technical literature has a fixed organizational structure that existing keyphrase generation models cannot fully exploit, a keyphrase generation method that incorporates key-sentence-level representation information is proposed. This method uses a binary classifier to estimate automatically whether each sentence is a key sentence, and then incorporates this sentence-level information into the keyphrase generation model. Experimental results show that the model improves keyphrase prediction, and that the improvement becomes more pronounced as the number of sentences in the text increases.
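
The abstract does not give the exact form of the agreement-based loss, so the following is only a minimal sketch of how such a multi-task objective could be assembled. It assumes the agreement term is the mean squared difference between the attention distributions that the two tasks place on the same source tokens. The function names and the weights `aux_weight` and `agree_weight` are hypothetical and do not come from the thesis.

```python
from typing import Sequence


def agreement_loss(attn_main: Sequence[float], attn_aux: Sequence[float]) -> float:
    """Mean squared difference between two attention distributions.

    Assumption: both tasks attend over the same source tokens, so the
    two sequences have equal length. The thesis may use another measure.
    """
    if len(attn_main) != len(attn_aux):
        raise ValueError("attention vectors must cover the same source tokens")
    if len(attn_main) == 0:
        raise ValueError("attention vectors must not be empty")
    squared = [(a - b) ** 2 for a, b in zip(attn_main, attn_aux)]
    return sum(squared) / len(squared)


def multitask_loss(
    keyphrase_nll: float,
    title_nll: float,
    attn_main: Sequence[float],
    attn_aux: Sequence[float],
    aux_weight: float = 0.5,    # hypothetical weight of the auxiliary task
    agree_weight: float = 0.1,  # hypothetical weight of the agreement term
) -> float:
    """Main-task loss + weighted auxiliary loss + weighted agreement penalty."""
    penalty = agreement_loss(attn_main, attn_aux)
    return keyphrase_nll + aux_weight * title_nll + agree_weight * penalty


if __name__ == "__main__":
    # Identical attention gives no penalty; disagreeing attention is penalized.
    same = multitask_loss(2.0, 1.0, [0.5, 0.5], [0.5, 0.5])
    diff = multitask_loss(2.0, 1.0, [1.0, 0.0], [0.0, 1.0])
    print(same, diff)
```

In this sketch the penalty is zero when both tasks attend to the source identically and grows as their attention diverges, which is one way to read the "constraints on the attention mechanism between different tasks" described above.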
Keywords/Search Tags:keyphrase generation, deep learning, sequence to sequence model, attention mechanism, multi-task learning