
Knowledge Unit Mining And Flow Pattern Research Based On Bi-LSTM-CRF Model

Posted on: 2021-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: S L Ye
Full Text: PDF
GTID: 2428330611969760
Subject: Applied statistics
Abstract/Summary:
A knowledge unit is a microscopic particle of knowledge information: it is the carrier of that information and the embodiment of knowledge at fine granularity, and it plays an indispensable role in the development of social civilization. A citation context is the descriptive text an author writes when citing a reference; it conveys the author's motivation, sentiment, and purpose in citing. Building a scientific, reasonable, and efficient machine learning model, extracting the knowledge units in citation contexts, and studying their flow patterns can help scholars understand how knowledge and technology are updated and developed, and is of great practical significance in giving scholars directions and ideas in their research fields.

Because no data set of knowledge units in citation contexts has yet been publicly released, this thesis selects 1,000 biomedical texts from PubMed, sampled proportionally over the years 2008 to 2018, extracts information such as the citation contexts, annotates the knowledge units, and constructs a relatively complete data set. General semantic features, character features, case features, and Brown clustering features based on word vectors are then extracted, and a Bi-LSTM-CRF knowledge unit mining model is constructed. The CRF model and the Bi-LSTM-CRF model are compared experimentally. The results show that the Bi-LSTM-CRF model recognizes knowledge units better and outperforms the CRF model on all three evaluation metrics: its precision is 0.7618, its recall is 0.7099, and its F1 value is 0.7349, a significant improvement in F1 over the CRF model.

To study the flow pattern of knowledge units in citation contexts, a heterogeneous information network is used to analyze and visualize the results at the macro and micro levels. At the macro level, the results show that in the biomedical field, compared with the other three types of knowledge unit, the fourth type, domain-specific knowledge units, accounts for the highest proportion in citation contexts. Scholars tend to cite more knowledge units from their own research field when citing references, and the experimental results are consistent with this objective fact. At the micro level, the flow path of a specific knowledge unit between papers can be seen clearly and intuitively, which makes it possible to grasp the rise and development direction of knowledge units.
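
As a quick consistency check on the reported metrics, the F1 value can be recomputed from the precision and recall. This is a minimal sketch that assumes the standard harmonic-mean definition of F1, which the abstract does not state explicitly; the function name `f1_score` is illustrative and not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the Bi-LSTM-CRF model.
reported_precision = 0.7618
reported_recall = 0.7099

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 4))
```

Rounded to four decimal places, the recomputed value agrees with the reported F1 of 0.7349, so the three figures are mutually consistent under this definition.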
Keywords/Search Tags: Citation, Knowledge unit, Bi-LSTM-CRF, CRF, Knowledge flow