Research On Graph-based Text Keyword Extraction Integrating Deep Learning

Posted on: 2022-04-19
Degree: Master
Type: Thesis
Country: China
Candidate: C X Chen
GTID: 2518306524480394
Subject: Computer Science and Technology
Abstract/Summary:
As the Internet has grown, so has the volume of text published on it. Readers need a quick way to grasp the main content of a text through its keywords and decide whether it is of interest to them. Keyword extraction is also a fundamental task in natural language processing (NLP), and its quality directly affects many downstream tasks. It has therefore received extensive attention and research.

Graph-based keyword extraction algorithms have been widely studied because they measure the importance of a word through its relationships with other words, and because they are typically unsupervised. However, most of these methods build the text graph from symmetric relationships such as co-occurrence. Such relationships are not derived from what it means for a word to be a keyword, and they ignore the fact that the relationship between two words can differ depending on its direction. In recent years, with the development of deep neural networks, some researchers have applied neural networks to keyword extraction and achieved promising results. However, deep learning-based models are usually supervised, require large amounts of training data, and are often not interpretable. To combine the strengths of the two approaches and offset their weaknesses, this thesis integrates them and studies the combination in depth.

The main contributions of this thesis are as follows:

1. Starting from the definition of text keywords, the concept of word relevance is proposed, together with a method for calculating it. Because word vectors are an effective representation of words, the projection of one word vector onto another is proposed as a way to express word relevance.
2. A word relevance-based text keyword extraction algorithm is proposed. It uses word relevance to construct a directed text graph and extracts keywords from that directed graph. Experiments and comparisons on public data sets demonstrate the superiority of the word relevance-based keyword extraction algorithm.
3. To address the shortcomings of existing deep learning-based keyword extraction algorithms, which require a large amount of training data and are not interpretable, this thesis introduces the attention mechanism into text keyword extraction and proposes a keyword extraction algorithm based on NSelf-Attention. To train the NSelf-Attention model, this thesis also proposes a masked language model based on NSelf-Attention.
4. Comparative experiments on related data sets demonstrate the superiority of the NSelf-Attention-based text keyword extraction algorithm proposed in this thesis.
Keywords/Search Tags: graph-based keyword extraction, deep learning, word relevance degree, NSelf-Attention
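
The abstract describes expressing word relevance through the projection of word vectors and ranking words on a directed graph, but it does not give the formulas. The following is a minimal sketch of that general idea under stated assumptions, not the thesis's actual method: relevance is taken to be the scalar projection of one vector onto another (which is asymmetric), negative values are clipped to zero, and words are ranked with a standard weighted PageRank. The function names, the toy vectors, and the damping value are all hypothetical.

```python
import math


def projection(u, v):
    """Scalar projection of u onto v: (u . v) / ||v||. Not symmetric in u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / norm_v if norm_v > 0 else 0.0


def build_directed_graph(vectors):
    """Assumed edge rule: weight(a -> b) = max(0, projection of vec[a] onto vec[b])."""
    words = list(vectors)
    return {
        a: {b: max(0.0, projection(vectors[a], vectors[b])) for b in words if b != a}
        for a in words
    }


def rank_words(graph, damping=0.85, iterations=50):
    """Weighted PageRank on the directed graph; returns words, highest score first.

    Words with no outgoing weight simply pass nothing on (no dangling-node fix).
    """
    words = list(graph)
    n = len(words)
    score = {w: 1.0 / n for w in words}
    out_sum = {w: sum(graph[w].values()) for w in words}
    for _ in range(iterations):
        new_score = {}
        for w in words:
            incoming = sum(
                score[src] * graph[src][w] / out_sum[src]
                for src in words
                if src != w and out_sum[src] > 0
            )
            new_score[w] = (1 - damping) / n + damping * incoming
        score = new_score
    return sorted(words, key=lambda w: score[w], reverse=True)


# Hypothetical 2-dimensional "word vectors", for illustration only.
toy_vectors = {
    "graph": [2.0, 0.0],
    "keyword": [1.0, 1.0],
    "banana": [0.0, 0.5],
}

ranking = rank_words(build_directed_graph(toy_vectors))
print(ranking)
```

Because the projection divides by the norm of the target vector only, the weight from word a to word b generally differs from the weight from b to a, which is what makes the resulting graph directed rather than a symmetric co-occurrence graph.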