Research On Citation Prediction Of Academic Papers Based On XLNet And GAT

Posted on:2022-03-13

Degree:Master

Type:Thesis

Country:China

Candidate:X F Chen

Full Text:PDF

GTID:2518306338489714

Subject:Control Engineering

Abstract/Summary:

In recent years, with the rapid development of science and technology, major breakthroughs have been made in many areas of scientific research. Scholars write up their research results as papers, which give subsequent scholars theoretical support and technical grounding. The citations of a scholar's papers reflect that scholar's influence in the research field. Predicting the citations of scholars' papers can not only help researchers quickly identify influential scholars in a field, but also help research management departments and funding agencies understand the development trends of disciplines, determine funding areas and topics, and better allocate resources. At the same time, the Internet era has made electronic publication of papers common, so the citation counts of academic papers and the papers scholars have published in recent years can be obtained by crawling and used for research on predicting scholars' citations.

At present, research at home and abroad on predicting the citation counts of academic papers falls mainly into statistical analysis methods, machine learning methods, and graph model-based methods. Well-known scholars usually have co-authoring relationships, so the co-authoring and citation relationships among scholars' papers are very helpful for predicting citation counts. However, statistical and machine learning methods cannot make full use of these relationships and simply treat each scholar as an isolated individual. Graph model-based methods use only the relationship graph of the papers and do not apply natural language processing to extract features from the text of the papers, so they fail to fully represent a scholar's research field and research content. This is an important feature for predicting a scholar's citations, because scholars active in recently popular research fields usually receive more citations.

The graph neural network, which has emerged in recent years, is an effective algorithm for modeling graph relationships. Once the adjacency matrix has been constructed, the graph neural network uses it to propagate features between nodes, thereby performing semi-supervised learning on the graph. On the other hand, since a paper's title is text, natural language processing techniques are needed to extract text features. In recent years, pretrained models including BERT, ELMo, the GPT series, and XLNet have made breakthrough progress on natural language processing tasks. Among them, XLNet, an autoregressive language model, overcomes the shortcomings of autoencoding models while still capturing context on both sides, and it has achieved very good results on multiple natural language tasks. In this thesis, XLNet is used to extract features from paper titles, and these features are concatenated with scholars' historical information to form the scholar features used for graph neural network training.

This thesis surveys related work at home and abroad on predicting the citations of scholars' papers, and analyzes and summarizes the deficiencies of current research. Based on the characteristics of the citation prediction task, it proposes four algorithms: the citation prediction algorithm XLNet_GAT, based on the pretrained model XLNet and the graph attention network GAT; the improved Word_Char_XLNet_GAT algorithm, based on multi-feature fusion of word-level and character-level segmentation; the improved Self_Att_XLNet_GAT algorithm, based on the self-attention mechanism; and the improved WC_Att_XLNet_GAT algorithm, based on both segmentation multi-feature fusion and the self-attention mechanism. The main work of this thesis comprises the following three points:

(1) The structure of Baidu Academic's paper summary pages is analyzed, and crawlers are used to collect summary information for Chinese papers in the field of artificial intelligence from the past five years, including the author list, the title, and the citation status of each paper. The experimental corpus for this research is built from these results.

(2) The shortcomings of existing methods at home and abroad for predicting scholars' citations are analyzed, and the citation prediction algorithm XLNet_GAT is proposed, which combines the pretrained model XLNet with the graph attention network GAT. The method constructs a directed graph, represented as an adjacency matrix, from the co-authoring and citation relationships of papers, and uses XLNet to extract text features from paper titles. Experiments show that the RMSE of the XLNet_GAT algorithm on the test set is about 10.8% lower than that of the XLNet_BiLSTM algorithm, and its R2 score is 13% higher.

(3) Building on XLNet_GAT and combining the XLNet features obtained from word-level and character-level segmentation, this thesis proposes the improved Word_Char_XLNet_GAT algorithm based on segmentation multi-feature fusion. In addition, the self-attention mechanism is used to fuse the XLNet features of multiple paper titles by the same scholar, giving the improved Self_Att_XLNet_GAT algorithm. Combining the advantages of these two algorithms, this thesis proposes the improved WC_Att_XLNet_GAT algorithm based on segmentation multi-feature fusion and the self-attention mechanism. Finally, ablation experiments demonstrate the effectiveness of the three improved algorithms.
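The feature propagation over an adjacency matrix described in the abstract can be sketched as a single-head graph attention layer. This is a minimal illustration in plain Python, not the thesis's implementation: the function names (gat_layer, project, dot, leaky_relu), the added self-loop, the LeakyReLU slope of 0.2, and the toy inputs are all assumptions, and the output nonlinearity and multi-head attention are omitted.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def project(H, W):
    """Multiply node features H (N x F) by the weight matrix W (F x F_out)."""
    cols = list(zip(*W))
    return [[dot(h, col) for col in cols] for h in H]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(H, A, W, a_src, a_dst):
    """Single-head graph attention layer (hypothetical helper for illustration).

    H: N x F node features (e.g. title features joined with scholar history).
    A: N x N adjacency matrix; A[i][j] != 0 means node i attends to node j.
    W: F x F_out projection; a_src, a_dst: length-F_out attention vectors.
    """
    Z = project(H, W)
    n = len(Z)
    out = []
    for i in range(n):
        # Attend to graph neighbours plus the node itself (assumed self-loop).
        nbrs = [j for j in range(n) if j == i or A[i][j]]
        # e_ij = LeakyReLU(a_src . z_i + a_dst . z_j)
        scores = [leaky_relu(dot(a_src, Z[i]) + dot(a_dst, Z[j])) for j in nbrs]
        # Softmax over the neighbourhood, shifted by the max for stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # New feature of node i: attention-weighted sum of neighbour features.
        out.append([sum(a * Z[j][k] for a, j in zip(alphas, nbrs))
                    for k in range(len(Z[0]))])
    return out

# Toy graph: nodes 0 and 1 are linked to each other, node 2 is isolated.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection keeps the toy readable
print(gat_layer(H, A, W, a_src=[0.0, 0.0], a_dst=[0.0, 0.0]))
```

With zero attention vectors every neighbour receives equal weight, so each node's output reduces to the mean of its own and its neighbours' projected features. Trained attention vectors would instead weight some co-authors or cited papers more heavily than others.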

Keywords/Search Tags:

XLNet, Paper Citation Forecast, Graph Neural Network, GAT, Self-attention Mechanism

Related items

1	Research On Ranking Scientific Publications Based On Citation Graph
2	Research On Graph Neural Network Recommendation Algorithm With Attention Mechanism
3	Research And Application Of Recommendation Algorithm Based On Graph Neural Network
4	Research On Knowledge Graph Citation Recommendation Combined With BERT And Graph Attention Networks
5	Research On Recommendation Algorithm Based On Graph Neural Network
6	Research On Group Recommendation Algorithm Based On Attention Mechanism And Graph Neural Network
7	Research On Audit Text Classification Based On XLNet
8	Design And Implementation Of Conversational Recommender System Based On Graph Embedding And Attention Mechanism
9	Research On Network Alignment Based On Attention Mechanism And Graph Neural Network
10	Housing Price Forecast With Attention Recurrent Neural Network