
Chinese Nested Named Entity Recognition And Relation Extraction

Posted on: 2019-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Li
Full Text: PDF
GTID: 2428330545451196
Subject: Computer Science and Technology
Abstract/Summary:
Nested named entities contain rich inner entities and the semantic relations between them, so nested named entity recognition and relation extraction are important steps in information extraction. Because no standard Chinese nested named entity corpora exist for recognition and relation extraction, it is currently difficult to compare research on Chinese nested named entity recognition or to make progress in Chinese nested named entity relation extraction. This thesis therefore covers the following three aspects:

(1) A semi-automatic method for constructing Chinese nested named entity corpora. First, as many nested named entities as possible are constructed automatically from the annotations in existing Chinese named entity corpora. The results are then adjusted manually to satisfy the annotation requirements for Chinese nested named entities. Preliminary recognition experiments within and across the corpora show that Chinese nested named entity recognition is still a challenging task.

(2) A method for automatically constructing a Chinese nested named entity recognition corpus from Chinese Wikipedia. First, Chinese Wikipedia entries are classified into entity types. Nested entity structures are then built from these entries to generate a large-scale Chinese nested named entity recognition corpus.

(3) A manually annotated Chinese nested named entity relation corpus, together with relation extraction experiments on it using Support Vector Machine and Convolutional Neural Network models. The experiments show that relation extraction on this corpus, when based on manually labeled entities, achieves an F1 score of over 95%.

This thesis shows that Chinese nested named entity recognition is a challenging task. Although the automatically constructed recognition corpus cannot match the manually annotated corpora in quality, it is larger in scale and has the potential to adapt to various domains. Finally, because nested named entities carry rich structural information, entity relation extraction performs very well. Hence, the key issue in extracting information from nested entities is how to improve the performance of nested entity recognition.
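To make the notion of nesting and the reported F1 metric concrete, here is a minimal sketch in Python. It is not the thesis's implementation: the `Entity` type, the example string, the `ORG`/`LOC` labels, and the exact-match scoring rule are all assumptions chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class Entity(NamedTuple):
    """A labeled character span (hypothetical representation)."""
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # illustrative label, not the thesis's tag set


def nested_pairs(entities: List[Entity]) -> List[Tuple[Entity, Entity]]:
    """Return (outer, inner) pairs where inner lies inside a different, larger span."""
    pairs = []
    for outer in entities:
        for inner in entities:
            same_span = (inner.start, inner.end) == (outer.start, outer.end)
            contained = outer.start <= inner.start and inner.end <= outer.end
            if contained and not same_span:
                pairs.append((outer, inner))
    return pairs


def f1_score(gold: List[Entity], predicted: List[Entity]) -> float:
    """Exact-match F1: a prediction counts only if span and label both match."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Illustrative example: a department name that contains a university name,
# which in turn contains a city name.
text = "北京大学计算机系"
department = Entity(0, 8, "ORG")   # 北京大学计算机系
university = Entity(0, 4, "ORG")   # 北京大学
city = Entity(0, 2, "LOC")         # 北京

gold = [department, university, city]
predicted = [department, university]  # a recognizer that misses the innermost entity

pairs = nested_pairs(gold)
score = f1_score(gold, predicted)
```

In this example the gold annotation yields three (outer, inner) pairs, and a prediction that misses only the innermost entity has precision 1.0 and recall 2/3. Missing inner entities lowers recognition scores even when the outermost entity is found.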
Keywords/Search Tags: nested named entity recognition, Wikipedia, corpus, relation extraction