
Mention Behavior And Influence Analysis Of Algorithm Based On Full-text Content Of Academic Articles

Posted on: 2021-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: R Y Ding
GTID: 2518306512488384
Subject: Information Science
Abstract/Summary:
Full-text databases of scientific publications are increasingly open to users. With the rapid development of natural language processing and machine learning, bibliometric and evaluation research based on full text, such as the extraction and evaluation of knowledge entities, has attracted scholars' attention. Algorithms are a typical kind of knowledge entity: in the era of big data, processing and analyzing datasets is inseparable from applying algorithms. Studying how algorithmic entities are mentioned in the academic literature of a specific field, and analyzing their influence on that basis, can reveal how they are distributed across full-text papers and identify the high-influence ones. This can help researchers understand and select suitable algorithms for their own work.

Using academic literature resources and natural language processing technology, this study extracts algorithmic entities from a large number of scientific papers and analyzes their mention frequency, mention location, mention time, and academic influence.

Extraction is treated as a special named entity recognition task. First, a dictionary of 977 algorithms is built by manually annotating 4,641 ACL papers. Next, all sentences that contain a dictionary entry, called algorithmic sentences, are collected and used as a labeled corpus to train models that extract algorithmic entities automatically. Applied to the dataset with the labeled corpus removed, the model extracts 51,884 entities; after discarding entities that occur only once and screening the rest manually, 221 new algorithmic entities remain. Finally, the automatic and manual results are merged into a total of 1,198 algorithmic entities, which are used for the mention analysis.

For mention frequency, the study uses two indexes: the number of papers that mention an algorithm and the total number of mentions across papers. The unique IDs of the articles that mention each algorithm are collected to count its mention frequency, and influence is analyzed on that basis.

For mention location, the study defines the location of a mention as the chapter type in which the algorithm is mentioned. The chapter type is extracted from the original dataset and attached to each algorithmic sentence. Using the number of papers that mention an algorithm as the index, the study analyzes the distribution of algorithmic entities across all locations and across the key chapters: Method, Evaluation, Discussion, and Conclusion. The influence of different algorithmic entities is then analyzed according to the key-chapter results.

For mention time, the study defines the mention time as the publication year of the paper that mentions the algorithm. Each algorithmic entity is linked to its article IDs to obtain its mention times. Using the number of papers that mention an algorithm as the index, the study examines how mentions of algorithmic entities change over a specific period. High-frequency algorithms are taken as examples to compare trends between algorithms, and influence is studied by combining time and frequency.

The purpose of this study is to reveal, along multiple dimensions, how algorithmic entities are mentioned and how influential they are in the full-text content of academic papers in a specific domain. The results can serve as a reference for selecting and using algorithms in related research.
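The dictionary-matching step and the three mention indexes described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy sentences, the two-entry dictionary, and the names `collect_mentions` and `summarize` are all hypothetical, and plain case-sensitive string matching stands in for the trained extraction model.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus: (article_id, publication_year, chapter_type, sentence).
SENTENCES = [
    ("P1", 2010, "Method", "We train an SVM and compare it with a CRF."),
    ("P1", 2010, "Evaluation", "The SVM outperforms the baseline."),
    ("P2", 2012, "Introduction", "Prior work used a CRF for tagging."),
    ("P2", 2012, "Method", "Our tagger is a CRF with lexical features."),
    ("P3", 2012, "Related Work", "SVM classifiers are widely used."),
]

# Hypothetical miniature stand-in for the manually built algorithm dictionary.
ALGORITHM_DICTIONARY = ["SVM", "CRF"]

# The four key chapters named in the abstract.
KEY_CHAPTERS = {"Method", "Evaluation", "Discussion", "Conclusion"}


def build_matcher(names):
    """Compile one case-sensitive pattern matching any name as a whole token."""
    # Longer names first, so a longer entry wins over a shorter one it contains.
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")


def collect_mentions(sentences, names):
    """Return one (algorithm, article_id, year, chapter) tuple per mention."""
    matcher = build_matcher(names)
    mentions = []
    for article_id, year, chapter, text in sentences:
        for match in matcher.finditer(text):
            mentions.append((match.group(0), article_id, year, chapter))
    return mentions


def summarize(mentions):
    """Compute frequency, key-chapter, and per-year indexes for each algorithm."""
    total_mentions = Counter()
    papers = defaultdict(set)
    key_chapter_papers = defaultdict(set)
    papers_by_year = defaultdict(lambda: defaultdict(set))
    for name, article_id, year, chapter in mentions:
        total_mentions[name] += 1            # index 2: total number of mentions
        papers[name].add(article_id)         # index 1: unique mentioning papers
        if chapter in KEY_CHAPTERS:
            key_chapter_papers[name].add(article_id)
        papers_by_year[name][year].add(article_id)
    return {
        name: {
            "total_mentions": total_mentions[name],
            "paper_count": len(papers[name]),
            "key_chapter_paper_count": len(key_chapter_papers[name]),
            "paper_count_by_year": {
                year: len(ids)
                for year, ids in sorted(papers_by_year[name].items())
            },
        }
        for name in total_mentions
    }


if __name__ == "__main__":
    stats = summarize(collect_mentions(SENTENCES, ALGORITHM_DICTIONARY))
    for name, row in sorted(stats.items()):
        print(name, row)
```

In this toy data both algorithms are mentioned three times in two papers, but only one of the SVM papers mentions it in a key chapter, which shows how the paper-count and key-chapter indexes can diverge.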
Keywords/Search Tags:Full-text content analysis, Knowledge entity, Algorithmic entity, Mention behavior, Influence analysis of algorithm