Research On Citation Network Evaluation Of Scientific Literature Influence And Community Detection

Posted on: 2019-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Feng
Full Text: PDF
GTID: 2370330593950048
Subject: Computer Science and Technology
Abstract/Summary:
A citation network (CN) is a special kind of social network: a large-scale complex network built from the reference relationships between documents. Each published document forms a node, and each reference from one document to another forms a directed edge. A citation network carries rich content attributes, such as periodicals, authors and research fields, and reflects the spread of knowledge flow and information flow. With the rapid increase in the volume of scientific literature, accurately identifying valuable research literature has become an important issue. Influence evaluation of scientific literature and topic community detection are two important research directions in citation networks. Building on research into data mining algorithms, this thesis addresses the following two aspects.

(1) Evaluating the scientific value of publications has long been a research focus in bibliometrics, but some mainstream methods based on data mining overlook the influence of malicious activities, which leads to poor evaluation results. To solve this problem, we propose a new method named ReputeRank, which employs a creditworthiness mechanism to evaluate the effectiveness of publications in the citation network. The creditworthiness mechanism consists of three phases: seed selection, credit spreading and integrated computation. First, ReputeRank uses background information on the division of SCI periodicals to select potential good seeds and bad seeds in the citation network. Then, under the assumption that seeds with good credibility point to papers with higher credibility while seeds with bad credibility point to papers with lower credibility, the method uses the TrustRank and Anti-TrustRank evaluation formulas to iteratively spread trust values and distrust values over the citation network. Finally, from the trust and distrust values in the citation network, the method uses an integrated equation to compute a score for each paper and ranks all papers in descending order of score. Our experimental results on the KDD Cup 2003 datasets demonstrate that ReputeRank performs well in effectiveness and robustness compared with PageRank, degree count and SPRank.

(2) Community detection in citation networks has always been a hot research topic in complex networks. Traditional methods treat the citation network as a static graph for community mining and ignore the dynamic way in which the network evolves over time. To further improve the accuracy of community detection in citation networks, this thesis presents a dynamic community detection method based on the BPT model and a leader-follower strategy. First, the topic probability distribution matrix is generated according to the Bernoulli Process Topic (BPT) model; then the network is divided into communities according to the leader-follower strategy. The experimental results on the CiteSeer and Cora datasets show that the proposed method outperforms three other classical algorithms in both NMI and modularity.
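The trust and distrust spreading described in part (1) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not give the integrated equation, so the combination `trust - beta * distrust` is an assumed placeholder. The function names `propagate` and `repute_scores`, the damping factor, the iteration count and the toy graph are likewise hypothetical. Following the abstract's stated assumption, both trust and distrust are spread along citation edges, from the citing paper to the cited paper.

```python
def propagate(cites, seeds, damping=0.85, iterations=50):
    """Seed-biased PageRank over citation edges (citing paper -> cited paper).

    Teleport mass returns only to `seeds`, in the style of TrustRank.
    """
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one seed is required")
    nodes = set(cites) | seeds
    for targets in cites.values():
        nodes.update(targets)
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(seed_mass)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * seed_mass[n] for n in nodes}
        for src in nodes:
            targets = cites.get(src, [])
            if targets:
                # Split the damped score evenly among the cited papers.
                share = damping * score[src] / len(targets)
                for dst in targets:
                    nxt[dst] += share
            else:
                # Paper with no outgoing citations: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += damping * score[src] / len(seeds)
        score = nxt
    return score


def repute_scores(cites, good_seeds, bad_seeds, beta=1.0):
    """Combine trust and distrust. ASSUMPTION: score = trust - beta * distrust."""
    trust = propagate(cites, good_seeds)
    distrust = propagate(cites, bad_seeds)
    nodes = set(trust) | set(distrust)
    return {n: trust.get(n, 0.0) - beta * distrust.get(n, 0.0) for n in nodes}


# Hypothetical toy network: A -> B -> C is reachable from a good seed,
# X -> Y -> Z is reachable from a bad seed.
cites = {"A": ["B"], "B": ["C"], "C": [], "X": ["Y"], "Y": ["Z"], "Z": []}
scores = repute_scores(cites, good_seeds={"A"}, bad_seeds={"X"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network the papers reachable from the good seed end up with positive scores and rank above those reachable from the bad seed. On real data, the seed sets would come from the periodical-division background information that the abstract mentions.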
Keywords/Search Tags: citation network, evaluation of academic influence, community detection, data mining, probabilistic generative model