Keyword extraction is a fundamental natural language processing task whose goal is to extract representative keywords or phrases from a text. This paper discusses two problems with current sequence generation models: traditional sequence encoders struggle to learn the global information of a text, and current models fail to learn the important information carried by the title. Global information plays an important role in many unsupervised methods; statistics-based methods, for example, use global vocabulary co-occurrence relationships to obtain keywords. Current sequence generation models, however, rely only on the local context of the text and discard its global information. In addition, the title carries important topic information, yet the common practice of concatenating the title with the body text as a single input greatly weakens the role of the title in keyword extraction.

To address these two problems, this paper proposes several improvements to the keyword extraction task. The main contributions are as follows:

(1) A sequence generation model fused with a graph convolutional network. By combining the graph convolutional network with the global, corpus-level vocabulary co-occurrence relationship, the model's ability to capture global information in a sequence is improved, compensating for the weak global-information learning ability of traditional RNN or LSTM encoders (a minimal sketch follows this summary).

(2) A sequence generation model fused with a graph attention network. Because words in a text differ in importance, a multi-head attention mechanism is introduced on top of the graph convolutional network to assign corresponding weights to neighboring nodes, attending to high-weight nodes while ignoring less informative ones and thereby reducing information interference (sketched below).

(3) Bi-directional attention flow for interactive learning of title and text information. The title carries general topic information, but the current practice of splicing the title and text together as input ignores the title's importance, so a bi-directional attention flow module is added to the sequence encoding layer to strengthen the influence of the title on the text representation while supplementing information in both directions (sketched below).

Finally, this paper conducts comparative experiments on five datasets to verify the effectiveness of the proposed methods. The results show that the keyword extraction model fused with bi-directional attention flow learns the important information in the title and improves the accuracy of keyword extraction. Building on this model, fusing a graph neural network further improves the capture of global text information, raising keyword extraction performance and achieving the best results on multiple datasets.
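
To make contribution (1) concrete, the following is a minimal sketch of a graph convolutional layer applied to a word co-occurrence graph. The graph construction (symmetrically normalized co-occurrence counts with self-loops) and the fusion point with the sequence encoder are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal GCN-over-co-occurrence sketch (assumptions: adjacency built from
# co-occurrence counts, fusion with the encoder by concatenation).
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One GCN layer: H' = ReLU(A_hat @ H @ W), with A_hat a normalized adjacency."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, adj: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # adj: (num_words, num_words) normalized co-occurrence adjacency
        # h:   (num_words, in_dim) word embeddings or encoder states
        return torch.relu(self.linear(torch.matmul(adj, h)))


def normalized_adjacency(cooc: torch.Tensor) -> torch.Tensor:
    """Symmetrically normalize a co-occurrence matrix: D^{-1/2} (A + I) D^{-1/2}."""
    a = cooc + torch.eye(cooc.size(0))        # add self-loops
    deg = a.sum(dim=1)
    d_inv_sqrt = torch.diag(deg.pow(-0.5))
    return d_inv_sqrt @ a @ d_inv_sqrt


if __name__ == "__main__":
    num_words, emb_dim = 6, 16
    cooc = torch.rand(num_words, num_words)   # placeholder co-occurrence counts
    cooc = (cooc + cooc.t()) / 2              # make the graph symmetric
    adj = normalized_adjacency(cooc)
    embeddings = torch.randn(num_words, emb_dim)
    gcn = GCNLayer(emb_dim, emb_dim)
    global_states = gcn(adj, embeddings)      # (num_words, emb_dim)
    print(global_states.shape)
```

In the full model, such GCN outputs would be combined with the RNN/LSTM encoder states (for example by concatenation) before keyword decoding; that fusion strategy is one possible choice, not necessarily the paper's.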
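
Contribution (2) replaces the uniform neighbor aggregation of the GCN with attention over adjacent nodes. Below is a minimal multi-head graph attention sketch; the class name, head count, and dot-product scoring are assumptions for illustration, and the code assumes the adjacency matrix contains self-loops so every word has at least one neighbor.

```python
# Minimal multi-head attention restricted to graph neighbors (illustrative only).
import torch
import torch.nn as nn


class GraphAttention(nn.Module):
    """Dot-product attention over co-occurring words, split into several heads."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, adj: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        n, d = h.shape
        # project and split into heads: (num_heads, n, head_dim)
        q = self.q(h).view(n, self.num_heads, self.head_dim).transpose(0, 1)
        k = self.k(h).view(n, self.num_heads, self.head_dim).transpose(0, 1)
        v = self.v(h).view(n, self.num_heads, self.head_dim).transpose(0, 1)
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5   # (heads, n, n)
        # only attend to adjacent (co-occurring) words; others are masked out
        mask = (adj > 0).unsqueeze(0)
        scores = scores.masked_fill(~mask, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        out = weights @ v                                         # (heads, n, head_dim)
        return out.transpose(0, 1).reshape(n, d)
```

The masking step is what distinguishes this from ordinary self-attention: weights are only distributed among co-occurring words, so uninformative nodes receive little or no weight.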
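
Contribution (3) uses bi-directional attention flow between the title and the body text. The sketch below follows the standard BiDAF formulation (text-to-title and title-to-text attention); how the merged representation feeds the paper's decoder is an assumption.

```python
# Minimal bi-directional attention flow between title and text (illustrative).
import torch
import torch.nn as nn


class BiAttentionFlow(nn.Module):
    """Produces title-aware text states following the standard BiDAF layer."""

    def __init__(self, dim: int):
        super().__init__()
        # similarity(t, u) = w^T [t; u; t * u], as in the original BiDAF paper
        self.w = nn.Linear(3 * dim, 1, bias=False)

    def forward(self, text: torch.Tensor, title: torch.Tensor) -> torch.Tensor:
        # text:  (T, dim) encoder states of the body text
        # title: (J, dim) encoder states of the title
        t_len, j_len, d = text.size(0), title.size(0), text.size(1)
        t = text.unsqueeze(1).expand(t_len, j_len, d)
        u = title.unsqueeze(0).expand(t_len, j_len, d)
        sim = self.w(torch.cat([t, u, t * u], dim=-1)).squeeze(-1)   # (T, J)

        # text-to-title: each text position attends over the title
        a = torch.softmax(sim, dim=1)
        title_aware = a @ title                                      # (T, dim)

        # title-to-text: weight text positions by their best title match
        b = torch.softmax(sim.max(dim=1).values, dim=0)              # (T,)
        text_summary = (b.unsqueeze(0) @ text).expand(t_len, d)      # (T, dim)

        # merged, title-aware representation passed on to later layers
        return torch.cat([text, title_aware, text * title_aware,
                          text * text_summary], dim=-1)              # (T, 4*dim)
```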