Identification Optimization Of Research Fronts Based On Co-Citation-Coupling Network And LDA Model

Posted on: 2021-05-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J L Li
GTID: 1488306521463164
Subject: Information Science
Abstract/Summary:
A research front is the latest, most forward-looking, and leading direction in scientific research. As a breakthrough point and new growth point of scientific and technological innovation, it has attracted much attention, especially since the full implementation of the national innovation-driven development strategy outline. Facing this new trend in scientific and technological innovation, it is of great significance to identify research fronts as early and as accurately as possible and to predict their future direction and key points, so as to better serve macro-level decisions on national scientific and technological development, allocate scientific and technological resources reasonably, and help researchers grasp research trends in time.

At present, there are three main approaches to identifying research fronts: citation-based, content-based, and a combination of the two. The combined citation-content approach, which draws on both citation analysis and content analysis, is currently the most active. The more common practice is to combine co-citation analysis, citation coupling, and text content analysis. However, the combination has mainly been applied at the levels of literature clustering and cluster description, and has not been extended to the clustering basis itself. As a result, the "prospective value" and "academic accuracy" of the identified research fronts are often questioned.

In view of this, and to better match the "research front" as scientists themselves perceive it, this dissertation proposes "Identification of Research Fronts based on Co-citation-coupling & LDA", which optimizes the citation-content combination method at two levels: the identification information domain (the clustering basis) and the semantic depth of identification. First, it attempts to construct a research front information domain that is more novel, more academically relevant, and broader in coverage, based on a variety of academic citation relations. On this basis, an LDA topic model is used to identify research fronts directly at the semantic level of the text content. The main objective is to comprehensively expand the information domain and semantic depth of the citation-content combination method, and to improve the foresight and academic accuracy of research front detection.

Various research methods were adopted, including literature research, inductive analysis, bibliometrics, text mining, empirical research, expert evaluation, and comparative research. The dissertation focuses on two areas: the theories and methods related to research front identification; and the design, experimental analysis, and effect evaluation of "Identification of Research Fronts based on Co-citation-coupling & LDA". The main results of the theoretical and empirical research are as follows:

(1) Theoretical research: Based on an in-depth analysis of the development and evolution of research fronts, the connotation and characteristics of research fronts were summarized and defined: a research front is a group of the latest research topics with high activity and academic attention. Activity, novelty, and attention are its most basic characteristics.

(2) Method research: This covers the construction of the research front information domain, topic extraction based on phrase LDA, and research front identification based on "citation-content" features. First, from the perspective of academic citation and using a variety of academic citation relations, the research put forward the principles, ideas, and processes for constructing a new research front information domain with better identification value. Second, it designed the specific methods and processes of topic extraction with the phrase LDA model, including key steps such as corpus construction before extraction and parameter setting during extraction. Third, based on the basic characteristics of "activity", "novelty", and "attention", a composite identification index, Research Fronts Identification Metrics (RFIM), was designed and constructed, integrating literature topic and citation characteristics.

(3) Empirical research: The field of immunology was selected for an empirical study of the new identification method, and 26 immunology research fronts were finally identified according to RFIM and expert evaluation. The accuracy of front identification reached 86.7%. The mean of the average publication year of the front topics was 2017.6, a marked improvement in novelty over the classical co-citation analysis method (2016.3).

The main conclusions and innovations of the dissertation are as follows. First, it proposes "Identification of Research Fronts based on Co-citation-coupling & LDA" to achieve a deep combination of citation and content, effectively expanding the clustering basis and deepening the semantic depth, which can improve both the foresight of the identified fronts and the accuracy of semantic recognition of content. Second, it designs and constructs a composite identification index, Research Fronts Identification Metrics (RFIM), which integrates the characteristics of literature citation and content. Under the framework of the topic model, the index quantifies the basic characteristics of a research front through three sub-indexes: "thematic activity", "thematic novelty", and "thematic impact". Compared with a single index, RFIM together with its sub-indexes not only reflects comprehensively how far a research topic lies at the frontier, but also shows the topic's specific performance in activity, novelty, and attention, which helps improve the accuracy of front identification.

By combining the front detection process with the specific presentation process, the proposed method can process large-scale academic corpora directly and efficiently, providing a new idea and methodological reference for the scientific exploration of research fronts in the era of big data. It not only optimizes the relevant information analysis methods methodologically, but also offers a new practical perspective for accurately grasping the dynamics of the science and technology frontier.

The dissertation includes 27 diagrams, 21 tables, and 4 appendices.
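
The two citation relations named in the abstract can be illustrated with a minimal Python sketch. It computes co-citation counts (how often two cited papers appear in the same reference list) and citation coupling strengths, usually called bibliographic coupling (how many references two citing papers share). The toy `references` data and the function names are assumptions made for illustration only; the abstract does not specify how the dissertation merges these relations into its co-citation-coupling network, so that step is not shown.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of papers it cites.
references = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"B", "C", "D"},
}


def co_citation_counts(refs):
    """Count how often each pair of cited papers occurs in the same reference list."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


def coupling_strengths(refs):
    """Count the references shared by each pair of citing papers."""
    strengths = {}
    for p, q in combinations(sorted(refs), 2):
        shared = len(refs[p] & refs[q])
        if shared:  # keep only pairs that share at least one reference
            strengths[(p, q)] = shared
    return strengths


co_citation = co_citation_counts(references)
coupling = coupling_strengths(references)

print(co_citation[("A", "B")])   # A and B are cited together by P1 and P2
print(coupling[("P1", "P3")])    # P1 and P3 share the references B and C
```

Co-citation links older, already cited papers, while coupling links the citing papers themselves, so coupling can connect recent publications before they have accumulated citations. This is one common reason for using both relations when novelty matters.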
Keywords/Search Tags:Research front identification, Research front information domain, ESI research fronts, Co-citation-coupling network, LDA topic model, PhraseLDA model, Research Front Identification Metrics