
Research On Domain Ontology Construction And Fine-grained Entity Classification Methods Based On Sparse Labeling

Posted on: 2022-06-02  Degree: Master  Type: Thesis
Country: China  Candidate: H Y Wu  Full Text: PDF
GTID: 2518306728974909  Subject: Master of Engineering
Abstract/Summary:
With the arrival of the big data era, people can obtain information more conveniently, but this has also brought the problem of information explosion. How to efficiently mine valuable information has become a focus of research worldwide. Knowledge graphs, which link diverse pieces of information, emerged in response. Domain ontology construction and entity classification are two important subtasks of knowledge graph construction, and domain word extraction is the basis of domain ontology construction. This thesis therefore focuses on two tasks: domain word extraction and entity classification. First, a domain word extraction algorithm based on linguistic rules and BERT embeddings is proposed to address the scarcity of annotated corpora in domain word extraction. The algorithm uses linguistic rules to separate strictly non-domain words from the domain text and adds them to the original sparsely labeled corpus, thereby expanding it. Domain words are then extracted with a BERT-based word classification model, improving extraction accuracy. Second, to address the scarcity of labeled corpora in fine-grained entity classification, this thesis applies task-agnostic meta-learning to the task for the first time and builds a task-agnostic fine-grained entity classification algorithm on top of the pre-trained BERT model. The algorithm first uses the pre-trained BERT model to map each word in an instance to a low-dimensional vector space, and then uses task-agnostic meta-learning to build a multi-task fine-grained entity classification model. Regularization terms are then added on the basis of the predicted results to reduce the inequality between tasks and improve generalization to new tasks. Finally, the proposed domain word extraction method based on linguistic rules and the entity classification method based on task-agnostic meta-learning are evaluated on custom datasets and public datasets, respectively. The experimental results show that the proposed algorithms achieve better performance.
Keywords/Search Tags:domain ontology construction, domain word extraction, linguistic rules, fine-grained entity classification, task-agnostic meta-learning
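The abstract does not state the form of the regularization term that reduces inequality between tasks. As a minimal sketch, assuming it is an inequality measure over per-task losses (the Theil index is one such measure used in the task-agnostic meta-learning literature), the regularized objective could look like the following. The function names, the weight `lam`, and the sample loss values are hypothetical and are not taken from the thesis; plain floats stand in for the tensors a real training loop would use.

```python
import math


def theil_index(losses):
    """Theil index of positive per-task losses; it is 0 when all losses are equal."""
    if not losses or any(loss <= 0 for loss in losses):
        raise ValueError("losses must be a non-empty sequence of positive numbers")
    mean = sum(losses) / len(losses)
    return sum((loss / mean) * math.log(loss / mean) for loss in losses) / len(losses)


def regularized_meta_loss(task_losses, lam=0.1):
    """Mean task loss plus an inequality penalty (assumed form, weight lam is hypothetical)."""
    mean_loss = sum(task_losses) / len(task_losses)
    return mean_loss + lam * theil_index(task_losses)


# Two hypothetical batches of tasks with (almost exactly) the same mean loss:
balanced = regularized_meta_loss([0.5, 0.5, 0.5])
skewed = regularized_meta_loss([0.1, 0.2, 1.2])
```

Under this assumption, the skewed batch is penalized more than the balanced one even though the average loss is the same, which pushes the meta-learner away from favoring some tasks over others.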