The Research Of Open Information Extraction System

Posted on:2021-07-12

Degree:Master

Type:Thesis

Country:China

Candidate:J L Zhan

Full Text:PDF

GTID:2518306503980369

Subject:Electronics and Communications Engineering

Abstract/Summary:

Information extraction, the task of extracting useful knowledge from large amounts of text, is an increasingly popular research field. The work in this thesis belongs to two sub-fields of information extraction: open information extraction (Open IE) and named entity recognition (NER). For Open IE, the thesis contributes in both the model and the use of data. On the model side, it proposes a span selection model that outperforms other Open IE models. On the data side, it uses the confidence score as a coefficient of the loss function, which allows more patterns to be extracted from noisy training data. The thesis also points out annotation problems in the existing Open IE benchmark and relabels a new benchmark for future research. The second contribution is a fully unsupervised NER system that relies only on word embeddings. The system consists of two modules: an NE detection module based on a Gaussian Hidden Markov Model and an NE type classification module based on a Deep Auto-Encoder Gaussian Mixture Model. The novelty of this system is its combination of unsupervised machine learning algorithms with pre-trained word embeddings. Because the system does not depend on an annotated corpus, it is more robust than supervised or semi-supervised NER systems.
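
The idea of using a confidence score as a coefficient of the loss function can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `weighted_cross_entropy`, the toy probabilities, and the choice of cross-entropy as the base loss are all assumptions made for illustration. Each training example's loss is multiplied by its confidence, so low-confidence (likely noisy) examples contribute less.

```python
import math

def weighted_cross_entropy(probs, labels, confidences):
    """Mean cross-entropy where each example's loss is scaled by its
    confidence score. Hypothetical helper, not taken from the thesis."""
    if not (len(probs) == len(labels) == len(confidences)):
        raise ValueError("inputs must have the same length")
    total = 0.0
    for p, y, c in zip(probs, labels, confidences):
        # p: predicted class distribution for one example
        # y: index of the (possibly noisy) gold class
        # c: confidence score used as the loss coefficient
        total += -c * math.log(p[y])
    return total / len(probs)

# Toy data (assumed): the second example has a low-confidence label.
probs = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 0]
confidences = [1.0, 0.1]

loss = weighted_cross_entropy(probs, labels, confidences)
unweighted = weighted_cross_entropy(probs, labels, [1.0, 1.0])
```

With these toy values, the low-confidence example is down-weighted, so `loss` is smaller than `unweighted`; the model is penalized less for disagreeing with a label that is likely to be noise.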

Keywords/Search Tags:

Open Information Extraction, Unsupervised Named Entity Recognition, Noisy Data Processing, Pre-trained Word Embeddings

Related items

1	Research On Open Relation Extraction And Classification Based On Word Embeddings
2	Research And Application Of Named Entity Recognition Method For Dialogue Domain
3	Research And Implementation Of Open Information Extraction Method For Automatic Construction Of Knowledge Graph
4	Research On Entity Extraction In Signal Processing Based On Dependency Word Vector
5	Design And Implementation Of Large-Scale Open Information Extraction System
6	Research On Chinese Named Entity Recognition Based On XLNet And Word Segmentation Fusion Coding
7	Research Of Joint Extraction Of Entities And Relations Based On Pre-trained Model
8	Research On Information Extraction With Complex Entity
9	Research On Named Entity Recognition Method For Network Security Domain
10	Design And Implementation Of Chinese Microblog Oriented Product Named Entity Recognition And Normalization Algorithm