
Quantifying Long-term Scientific Impact

Posted on: 2019-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: F Li
Full Text: PDF
GTID: 2428330566980001
Subject: Computer application technology
Abstract/Summary:
Papers are the cornerstone of researchers' work, yet quickly finding the desired paper is difficult, for two reasons: first, the scientific citation network keeps growing dynamically; second, traditional quantitative indicators cannot predict long-term scientific impact. In the era of big data, citation networks and their influence are recorded as data, mainly in literature databases. Faced with such huge databases, quantifying long-term scientific impact quickly and accurately has become a challenging and meaningful problem.

There are many traditional quantitative indicators; the most commonly used are the impact factor (IF) and the H index, along with variants of both. The impact factor applies to journals. In general, the higher the impact factor, the greater the journal's influence, but the impact factor has a fatal flaw: it treats all citations as equivalent. Impact factors differ significantly across research areas. For example, because biology has a large number of journals, its overall citation frequency is the highest, which gives the field higher impact factors overall than other disciplines. At present the impact factor is treated as paramount, and the development of other disciplines has been held back by its limitations.

The H index applies to scholars. Compared with the impact factor, it takes into account both the number of citations and the number of cited articles a scholar has, so overall it evaluates a scholar's level more objectively. However, apart from issues related to the coverage of the literature database, the H index has another major flaw: it does not distinguish the author's actual contribution. For example, consider two scholars who both have an H index of 30. On the surface, their research levels are quite similar. However, if among these 30 articles the former has 25 first-author papers while the latter accounts for only a fifth of the articles as first author, the former has clearly made the greater research contribution.

In view of these shortcomings, scholars have in recent years proposed new ideas for quantifying the long-term influence of the literature. Two lines of work are representative. The first is based on graph theory; a typical example is PageRank, which together with its variants has been applied to scientific citation networks. The second is based on complex networks and uses three mechanisms of the scientific citation network (attenuation, aging, and fitness) to explain how long-term scientific impact arises. However, these methods suffer from problems such as the loss of structural information from the citation network and high computational complexity.

In this thesis, in order to quantify long-term scientific impact, we analyze and process large scientific citation network databases (the APS database, the ArnetMiner database, and the Microsoft Academic Research database) and thoroughly compare the classical quantitative analysis models. We use a feature-pair approach to exploit structural information from the scientific citation network, and we use a stationary time series model to analyze scientific citation networks and to predict the citation volume for KDD. The contributions of this thesis are as follows:

1) Building on the three mechanisms mentioned above, we improve the aging mechanism and propose an attenuation mechanism for the scientific citation network to analyze its nodes. The scientific citation network is a scale-free network, and we use the rate equation method to solve for its degree distribution.

2) To make use of the text structure information of papers in the scientific citation network, we propose the IIRank model. The model consists of two parts. First, the feature-pair method is used to model the literature, and similarity to a burst-mechanism model is used to measure the freshness of a paper. Second, an entropy calculation is added to the model, and the importance of an article is evaluated by the entropy of its feature pairs.

3) Through extensive experimental analysis, we find that the scientific citation network can be treated not only as a scale-free network but also as a citation sequence. As a citation sequence, it exhibits stationarity and non-randomness, and these two properties make it possible to analyze the citation network with a stationary time series. We use a series of statistical indicators to analyze the citation sequences and select an appropriate stationary sequence model based on them. With this model we predict the citation counts of the KDD conference, thereby quantifying the long-term influence of the literature. This confirms the feasibility of applying temporal models to scientific citation sequences.
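
The H index limitation discussed in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names `h_index` and `first_author_core` and the sample citation data are hypothetical, chosen only for illustration. The sketch computes the standard H index and then counts first-author papers among the h most-cited papers, showing that two scholars with the same H index can differ sharply in contribution.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def first_author_core(papers):
    """Count first-author papers among the h most-cited papers.

    `papers` is a list of (citation_count, author_position) tuples,
    where author_position == 1 means first author (illustrative format).
    """
    h = h_index([cites for cites, _ in papers])
    core = sorted(papers, key=lambda p: p[0], reverse=True)[:h]
    return sum(1 for _, position in core if position == 1)


# Hypothetical data: (citation count, author position) per paper.
scholar_a = [(10, 1), (8, 1), (5, 1), (4, 2), (3, 1)]
scholar_b = [(10, 5), (8, 3), (5, 1), (4, 4), (3, 2)]

for name, papers in [("A", scholar_a), ("B", scholar_b)]:
    h = h_index([cites for cites, _ in papers])
    print(name, "H index:", h, "first-author papers in core:", first_author_core(papers))
```

Both hypothetical scholars receive the same H index, yet the first-author count within the most-cited papers separates them, which is the distinction the plain H index cannot express.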
Keywords/Search Tags: Long-term scientific impact, Complex network, Mechanism model, Time series model