Evaluation And Analysis Of The Diversity Of Scientists Based On Text Mining

Posted on: 2024-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Huang
Full Text: PDF
GTID: 2557306917991819
Subject: Applied statistics
Abstract/Summary:
A scientist assessment system can provide a basis for fostering scientific innovation, understanding scientists and raising national research standards. Current evaluation systems mainly analyse scholars' research attributes, that is, they evaluate scientists through indicators of publication output, award-winning projects and the like. Although these indicators are closely related to the evaluation of scientists, evaluating a scientist is not the same thing as evaluating research output alone. China's society and economy are undergoing a period of rapid change, and many key problems and technical difficulties require the integration of multidisciplinary knowledge. If the assessment of scientists is reduced to the assessment of research output, it may hinder the training of national scientific talent and work against the diversification of scientists.

This thesis takes the disciplinary diversity of scientists as its basic perspective. Starting from scientists' web pages, it introduces text mining to classify scientists by discipline and, based on the classification results, analyses disciplinary diversity as a means of evaluating scientists. It also compares the disciplinary diversity of scientists across countries as a whole, with the aim of promoting reform of the scientist evaluation system and the development of disciplinary diversity.

The research consists of the following stages: 1) Collating the literature on scientist evaluation in China and abroad, deriving the research content, methodology and significance of this thesis, and introducing the relevant theory. 2) Using the global Top Scientists ranking data published by WOS up to 2021, together with knowledge mapping and visualisation techniques, analysing the global distribution of scientists and their publication output at a macro level. 3) Using web crawler technology to crawl the web pages of 190,067 scientists and current natural language processing algorithms to mine them, constructing a multi-class classification model of scientists' disciplines, deriving scientists' potential disciplines from the model's results, and analysing and comparing the disciplinary diversity of scientists across countries.

The results of the study show that: 1) Based on knowledge mapping and visual analysis of scientists' characteristics, the number of a country's scientists listed in the WOS ranking is proportional to the total number of articles published by that country. 2) Using text mining, the web pages of the roughly 190,000 listed scientists were crawled and, after word segmentation, keyword extraction and word-vector transformation, used to train classification models. The classification algorithms were compared using confusion matrices and macro-averaged indicators, with recall, accuracy, AUC and F1 computed to judge model performance, and the algorithm best suited to multi-class discipline classification and diversity analysis was identified. 3) Based on the classification results, index weights for the three dimensions of richness, balance and difference were calculated using the analytic hierarchy process (hierarchical analysis) to quantify disciplinary diversity, yielding a comprehensive disciplinary diversity score for scientists and a corresponding ranking.

The results show that, compared to the US and the UK, Israel and Norway have fewer scientists on the list but higher diversity scores, which is more in line with the original purpose of this study in terms of the realistic development of science and technology. Compared to the US and the UK, China has a higher h-index and disciplinary balance but lower disciplinary difference and richness, which means that scientists in China have a more balanced level of research across many disciplines and a greater advantage in the volume of academic output.

The main contributions of this thesis are as follows. First, it proposes a text-mining-led theoretical approach that analyses disciplinary diversity by mining scientists' potential disciplines from web text, offering a new way of thinking about disciplinary diversity analysis and a new use for discipline classification. Second, it is a new attempt to change the traditional method of evaluating scientists by approaching it from the perspective of disciplinary diversity, which can help the science and education sectors form recommendations on building disciplinary diversity and on evaluating scientists.
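
The weighting step mentioned in the results (index weights for richness, balance and difference obtained by hierarchical analysis, then combined into a comprehensive score) can be illustrated with a minimal sketch. It assumes the standard eigenvector form of the analytic hierarchy process; the pairwise comparison matrix, the dimension scores and the function name `ahp_weights` are hypothetical values chosen for illustration, not figures or code from the thesis.

```python
import numpy as np

# Saaty's random consistency index for small matrices (standard published values).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(pairwise):
    """Return (weights, consistency_ratio) for a reciprocal comparison matrix."""
    a = np.asarray(pairwise, dtype=float)
    n = a.shape[0]
    eigvals, eigvecs = np.linalg.eig(a)
    k = int(np.argmax(eigvals.real))          # index of the principal eigenvalue
    principal = np.abs(eigvecs[:, k].real)    # principal eigenvector, sign removed
    weights = principal / principal.sum()     # normalise so the weights sum to 1
    lambda_max = eigvals[k].real
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index / RANDOM_INDEX[n]


# Hypothetical judgements: richness vs balance = 2, richness vs difference = 3,
# balance vs difference = 1.5 (rows and columns: richness, balance, difference).
pairwise = [
    [1.0, 2.0, 3.0],
    [1 / 2, 1.0, 1.5],
    [1 / 3, 1 / 1.5, 1.0],
]
weights, cr = ahp_weights(pairwise)

# Hypothetical normalised dimension scores for one country, in the same order.
dimension_scores = np.array([0.8, 0.6, 0.4])
score = float(weights @ dimension_scores)     # weighted comprehensive score

print("weights:", weights.round(4))
print("consistency ratio:", round(cr, 6))
print("comprehensive score:", round(score, 4))
```

A consistency ratio below 0.1 is the usual threshold for accepting a set of pairwise judgements; the hypothetical matrix above is perfectly consistent, so its ratio is essentially zero.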
Keywords/Search Tags: Text Mining, Scientist Evaluation Methods, Subject Classification, Diversity Indicators