| Not all scientific work resembles the work that wins Nobel Prizes or is published in Nature or Science, which is characterized by considerable achievement and influence. A reasonable quantitative analysis of the remaining research, which is often tied to the formulation and revision of research policies, evaluation rules, etc., has been a long-standing research topic. Most research entities have two main public behaviors: publication and citation. The information contained in a publication is relatively limited. As a one-way expression of the author’s own research content, it can only show his/her research and collaborations in the corresponding period. Citation, as the communication between research works, contains more information. In addition to expressing the research content, it also reflects the value of the research over a longer time span, along with other rich content such as the related knowledge network. Therefore, a large share of quantitative analyses of scientific research achievements has gradually shifted to citation-based methods. In practice, research entity ambiguity together with nonstandard counting methods has created great obstacles to quantitative analysis: entity ambiguity and different counting methods often lead to different evaluation and ranking results. Accurately identifying research entities and choosing reasonable counting methods is therefore an important prerequisite for reliable quantitative analysis. The citations of a scientific research work can be roughly divided into two parts. One is citation by the author himself/herself, that is, self-citation. The other is citation by others, that is, non-self citation. Although an author may have plenty of reasons to cite his/her own work, the importance and impact of these citations remain controversial. Few self-citation studies have been carried out at the macro level of countries, and the existing studies are basically limited to one or a few countries, as well as simple data statistics 
and phenomenon analysis in one or a few specific fields. Comprehensive research and in-depth quantitative analysis of self-citation at this level are lacking. This study is based on the Web of Science (WoS) database. After the original data are cleaned and structured, research entities at the institution level are accurately identified through a rule-based disambiguation algorithm, and the rankings produced by different counting methods, strictly classified by their mathematical properties, are analyzed at that level. Finally, the first address of the first author of a paper is taken as the source of the article. The international citation ratios (i.e., non-self citation ratios) of major countries are calculated, and a corresponding modeling analysis is carried out to explain and verify the abnormal declining trend for China. The main research contents and findings are as follows. (1) Many research institutions have multiple name forms in the literature, mainly abbreviations of their names. This correspondence of one institution to multiple names is name ambiguity at the institution level, which complicates subsequent statistical analysis. This paper uses abbreviated author names to screen potentially identical institutions, and a rule-based algorithm to evaluate the textual similarity and inclusion relationships of institution names along multiple dimensions. Finally, multi-level geographic information in institution addresses is integrated to disambiguate them, yielding disambiguated institution name cross-reference tables for both the mathematics and the computer and information science domains of the WoS database. (2) There are many counting methods for the quantitative analysis of scientific work. Based on institution name disambiguation, this paper performs Complete Counting (CC), Complete-fractionalized Counting (CfC), Straight Counting with first author (SCf), Straight Counting 
with reprint author (SCr), Whole Counting (WC) and Whole-fractionalized Counting (WfC) for papers in the two corresponding fields at the institutional level. A correlation analysis was conducted on the rankings of publication and citation counts generated by the six counting methods, which are strictly defined and classified by their mathematical properties. Using Spearman’s correlation coefficient and hierarchical clustering, the methods fall into three categories: CfC, SCf and SCr show relatively high correlation among the top 30, 50 and 100 institutions and form one category, WC and WfC form another, and CC forms a category of its own. (3) Based on the analysis of the counting methods, the SCf method is selected for counting at the country level. Taking the first address of the first author as the document address, the ratios of international citations of major countries from 2010 to 2016 are calculated. The statistics show that, compared with other countries, the international citation ratio of China declines markedly year by year. A null model based on random citation shows that, at the national level, the higher the annual growth rate of citations issued, the greater the downward pressure on the ratio of international citations; conversely, a lower growth rate makes an upward trend more likely. This conclusion is well verified in the empirical data. This paper then simplifies the citation process through the "picking ball model" and qualitatively explains that the phenomenon arises from changes in the shares of different countries in the total volume of global scientific work. When the citation volume is simplified to the publication volume, the main conclusion still holds. (4) To account for the many other factors not included in the random citation model, this paper uses Z-scores to calculate the deviation between the actual probability of foreign citations to Chinese literature and its expectation, and the results 
show a slight upward trend and a slight downward trend in foreign citation preference for China between 2010 and 2016 when the effect of disciplinary volume is eliminated and not eliminated, respectively. This partly reflects a continuous decline, in active and emerging disciplines, in the actual probability of China receiving citations from other countries relative to the expectation implied by the volume of Chinese publications. |
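
The Z-score comparison in (4) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the version below is an assumption: under random citation, each foreign citation is taken to land on a Chinese paper with probability equal to China's share of publications, which makes the expected count binomial. The function name `citation_z_score` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def citation_z_score(observed: int, total: int, expected_share: float) -> float:
    """Z-score of an observed citation count against a random-citation expectation.

    Assumption (not stated in the abstract): each of `total` foreign citations
    independently targets a Chinese paper with probability `expected_share`,
    so the count of such citations is binomially distributed.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of citations")
    if not 0.0 < expected_share < 1.0:
        raise ValueError("expected_share must lie strictly between 0 and 1")

    # Mean and standard deviation of the binomial null model.
    mean = total * expected_share
    std = math.sqrt(total * expected_share * (1.0 - expected_share))

    # Positive values: cited more often than random citation predicts.
    # Negative values: cited less often than predicted.
    return (observed - mean) / std


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the study.
    z = citation_z_score(observed=1900, total=10_000, expected_share=0.2)
    print(f"z = {z:.2f}")
```

Tracking such a score per year, with the expected share computed either from all publications or within each discipline, would be one way to separate a volume effect from a citation preference; whether this matches the study's actual procedure is not stated in the abstract.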