A Comparative Study On Semantic Representation Of Keywords In Science And Technology Bibliometric Analysis

Posted on: 2022-08-16 | Degree: Master | Type: Thesis
Country: China | Candidate: P T Wang | Full Text: PDF
GTID: 2518306761984289 | Subject: Scientific Research Management
Abstract/Summary:
The semantic representation chosen for keywords determines how reliable the results of keyword analysis in scientific literature are. Although the semantic representation of words has been studied extensively, little work has systematically evaluated the quality of keyword semantic representation methods in the keyword-analysis scenario. As a result, researchers analyzing scientific and technological literature do not know which keyword representation method is most appropriate, and, faced with many new methods, they cannot effectively judge which one suits domain literature analysis. How, in practice, should a semantic representation of keywords be chosen?

To address these problems and to compare existing keyword semantic representation methods in a standardized, quantitative way, this study takes the keywords that best represent the themes of a domain's scientific literature as its basic objects of analysis and uses keyword clustering as the task scenario. The knowledge map of the "digital library" domain serves as the standard evaluation dataset (the "gold standard") against which clustering results are fitted. A total of 23 variants of five basic families of methods are compared: co-word matrix, co-word network, word representation learning, network representation learning, and graph neural network (including "pre-train and fine-tune" models and "semantic + structure" models). The weighted Jaccard coefficient is used to measure how well the k-means and hierarchical clustering results fit the gold standard. Based on this experimental scheme, and to test how general the findings are, comparative experiments on the semantic representation of scientific literature keywords were conducted in the "digital library" field under two task scenarios.

The two experiments lead to the following overall conclusions: (1) When a domain's research hotspots and subject structure are analyzed from literature keywords using computational techniques alone, the reliability of the conclusions falls significantly short of the results that domain experts expect. (2) The "pre-train and fine-tune" models in word representation learning and the "semantic + structure" models perform better under certain circumstances.

More specifically: (1) In the task of constructing a domain knowledge organization system, of the two clustering algorithms examined in this study, k-means should be preferred to hierarchical clustering when the best clustering quality is sought. Keyword representations obtained from the co-word matrix, network representation learning, or graph neural network embedding perform worse, while the traditional co-word network and word representation learning perform better. (2) In the task of high-frequency word analysis, apart from the "pre-train and fine-tune" models in word representation learning and the "semantic + structure" models, which achieve better results under certain circumstances, the methods show no significant difference.

Finally, based on these quantitative evaluation results, the thesis offers several suggestions on literature keyword analysis for information workers at different technical levels.
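The evaluation idea described above (co-word counts as the simplest representation, and a weighted Jaccard score against a gold standard) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact weighting formula, so the version here is an assumption: each gold-standard cluster is matched to its best-overlapping predicted cluster, and the Jaccard scores are averaged with weights proportional to gold cluster size. The function names and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def co_word_counts(papers):
    """Count how often each unordered keyword pair occurs in the same paper."""
    counts = defaultdict(int)
    for keywords in papers:
        # sorted(set(...)) drops duplicates and fixes the order inside each pair
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def jaccard(a, b):
    """Plain Jaccard coefficient of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_jaccard(gold, predicted):
    """Size-weighted mean of each gold cluster's best Jaccard match.

    ASSUMPTION: this weighting is illustrative only; the abstract does
    not state the formula used in the thesis.
    """
    total = sum(len(g) for g in gold)
    if total == 0 or not predicted:
        return 0.0
    best = (len(g) * max(jaccard(g, p) for p in predicted) for g in gold)
    return sum(best) / total


# Hypothetical toy data, not from the thesis.
papers = [
    ["digital library", "metadata", "ontology"],
    ["digital library", "metadata"],
    ["information retrieval", "user study"],
]
gold = [
    {"digital library", "metadata", "ontology"},
    {"information retrieval", "user study"},
]
predicted = [
    {"digital library", "metadata"},
    {"ontology", "information retrieval", "user study"},
]

pairs = co_word_counts(papers)
score = weighted_jaccard(gold, predicted)
print(pairs[("digital library", "metadata")], round(score, 3))
```

In a fuller pipeline, the `predicted` clusters would come from running k-means or hierarchical clustering on each of the keyword representations being compared, so that one score per method and per clustering algorithm can be tabulated.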
Keywords/Search Tags: scientific literature, keyword analysis, semantic representation, co-word analysis, co-word network, representation learning