
Subdivision Domain Entity Recognition And Application Under The Guidance Of Small-scale Knowledge Base

Posted on: 2023-12-08
Degree: Master
Type: Thesis
Country: China
Candidate: J B Peng
GTID: 2558307070953599
Subject: Information Science
Abstract:
With the advent of the big data era, a large number of documents have accumulated in various domains. Faced with this volume of complex documents, researchers want to grasp the main research topics and core knowledge of a domain quickly and clearly. There is therefore an urgent need to identify and extract knowledge entities from large collections of domain documents and to use them in applications such as domain knowledge organization, knowledge graph construction, and knowledge evolution analysis. At present, the biggest problem with mainstream supervised learning approaches to domain entity recognition is their heavy reliance on manually labeled corpora. On the one hand, this makes domain entity recognition costly. On the other hand, models must be adjusted whenever they are applied to a subdivided domain, so their ability to generalize across domains is poor.

To solve this problem, guided by a small-scale domain knowledge base, we divided the domain entity recognition task into two subtasks, entity boundary recognition and entity classification, so as to make full use of existing domain resources. Entity boundary recognition was treated as an English word segmentation task, which could be solved well using keywords from large-scale domain documents together with a domain glossary. Entity classification was essentially a multi-class classification task over phrases, which could be solved well by constructing training data at low cost from the existing domain glossary and combining different models and text features. To verify the effectiveness of the method, we took the domain of artificial intelligence as an example and identified the problem entities and solution entities in its documents. The experimental results showed that, among the models integrating different text features, SVM performed best, with P, R, and F1 reaching 0.8526, 0.8496, and 0.851, respectively.

Finally, the optimal model was applied to extract problem entities and method (solution) entities from all documents. Based on the extraction results, we carried out a knowledge evolution analysis of the artificial intelligence domain, presenting the evolution of problem-solving-oriented domain knowledge at the macro, meso, and micro levels. At the macro level, the evolution of overall domain knowledge was compared and analyzed using entity relationship networks for different time segments. At the meso level, the evolution trends of the two kinds of knowledge entities were revealed using evolution heat maps of high-frequency problem and solution entities. At the micro level, taking computer vision as an example, we showed heat maps of the solutions to individual problems and several typical evolution trends.
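The glossary-guided boundary recognition step described above can be illustrated with a minimal sketch. It assumes a greedy forward longest-match against a lowercased glossary, which is a common dictionary-based segmentation strategy and not necessarily the procedure used in the thesis. The glossary entries, the example sentence, and the names `find_entity_spans` and `f1_from_pr` are all hypothetical. The last function checks that the reported F1 is consistent with the reported P and R, under the assumption that F1 is the harmonic mean of those two values.

```python
from typing import List, Set, Tuple


def find_entity_spans(tokens: List[str], glossary: Set[str],
                      max_len: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) token spans of glossary phrases.

    Greedy forward longest-match: at each position, try the longest
    candidate phrase first (up to max_len tokens) and accept the first
    one found in the glossary. This is an illustrative stand-in for the
    word-segmentation step, not the thesis's actual algorithm.
    """
    spans = []
    i = 0
    while i < len(tokens):
        match_end = None
        for j in range(min(len(tokens), i + max_len), i, -1):
            if " ".join(tokens[i:j]).lower() in glossary:
                match_end = j
                break
        if match_end is None:
            i += 1          # no phrase starts here; move one token on
        else:
            spans.append((i, match_end))
            i = match_end   # continue after the matched phrase
    return spans


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical glossary and sentence, for illustration only.
glossary = {"image classification", "convolutional neural network",
            "neural network"}
tokens = ("We apply a convolutional neural network "
          "to image classification").split()

spans = find_entity_spans(tokens, glossary)
phrases = [" ".join(tokens[start:end]) for start, end in spans]
print(phrases)

# Consistency check on the reported scores (assumes F1 = 2PR / (P + R)).
print(round(f1_from_pr(0.8526, 0.8496), 3))
```

Trying the longest candidate first means that "convolutional neural network" is preferred over the shorter glossary entry "neural network". The extracted phrases would then be passed to the second subtask, entity classification, which this sketch does not cover.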
Keywords: Domain Entity Recognition, Low Resource Entity Recognition, Entity Boundary Recognition, Entity Classification, Knowledge Evolution Analysis