Research On Domain Adaptive Chinese Entity Relation Extraction

Posted on:2012-05-06

Degree:Master

Type:Thesis

Country:China

Candidate:L F Wang

Full Text:PDF

GTID:2218330362450415

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid spread of computers and the rapid development of the Internet, the amount of available information keeps growing. How to obtain the necessary information quickly and accurately from such massive data has therefore become a topic of concern. The main purpose of information extraction is to transform unstructured natural language text into semi-structured or structured data, so that people can obtain key information quickly and accurately. Relation extraction, as one of the subtasks and key technologies of information extraction, has gradually become an important supporting technique for many natural language processing tasks.

Traditional relation extraction methods require predefined relation types and rely on large amounts of manually annotated training corpora, so they struggle to meet the needs of processing massive Internet information. We propose a new relation extraction research framework that avoids human intervention as far as possible and has strong domain adaptive capacity, in order to improve the automation and portability of relation extraction.

First, by analyzing the linguistic phenomena in the context of relation instances, we found that for the vast majority of entity pairs that hold a semantic relation, the relation is triggered or described by general verbs and nouns (referred to as feature words). This paper therefore proposes a feature word clustering method, which automatically discovers relation types from a certain amount of unlabeled corpus and whose output can be compared with manually predefined results.

Second, for the large number of relation types to be processed, this paper proposes a Web mining based relation seed extraction method, which makes full use of search engines' large-scale data collection and processing capabilities to extract a representative relation core network. The method achieves an average precision of 90.91% on nine selected relation types.

Next, according to Chinese linguistic characteristics, this paper defines the general context pattern and its generalization, and then introduces a bootstrapping method. The method takes the relation core network as input and iteratively generates relation description patterns and extracts relation tuples. In a manual evaluation of sampled relation tuples, the average precision reaches 88.24%, which meets practical needs.

Finally, a domain adaptive relation extraction platform named XInfo is designed and implemented. On this platform, researchers can focus on improving and studying algorithms and run experiments quickly. XInfo can also support natural language processing research and applications. In addition, this paper takes social relations between people as an application task and develops an online demo system that shows relation extraction results in an intuitive and clear way.
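The bootstrapping step in the abstract (seed tuples in, patterns and new tuples out, repeated) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `bootstrap`, the frequency threshold `min_pattern_count`, the English toy sentences, and the choice to treat the literal text between two entities as a "pattern" are all assumptions made for illustration. The abstract's general context patterns and their generalization for Chinese are not reproduced here.

```python
import re
from collections import Counter

def bootstrap(sentences, seeds, iterations=2, min_pattern_count=2):
    """Toy pattern bootstrapping: seed tuples -> patterns -> new tuples, repeated.

    A "pattern" here is simply the literal text found between the two
    entities of a known tuple (an illustrative simplification).
    """
    tuples = set(seeds)
    patterns = set()
    for _ in range(iterations):
        # Step 1: count the text that connects already-known entity pairs.
        counts = Counter()
        for sent in sentences:
            for left, right in tuples:
                m = re.search(re.escape(left) + r"(.+?)" + re.escape(right), sent)
                if m:
                    counts[m.group(1)] += 1
        # Keep only connectors seen often enough to be trusted.
        patterns |= {p for p, c in counts.items() if c >= min_pattern_count}

        # Step 2: apply the trusted patterns to extract new entity pairs.
        for sent in sentences:
            for p in patterns:
                m = re.fullmatch(r"(\w+)" + re.escape(p) + r"(\w+)\.?", sent)
                if m:
                    tuples.add((m.group(1), m.group(2)))
    return patterns, tuples

# Hypothetical toy corpus and seed tuples for a "wife of" relation.
sentences = [
    "Alice is the wife of Bob.",
    "Carol is the wife of Dave.",
    "Erin is the wife of Frank.",
    "Grace works with Heidi.",
]
seeds = {("Alice", "Bob"), ("Carol", "Dave")}

patterns, tuples = bootstrap(sentences, seeds)
print(patterns)
print(sorted(tuples))
```

On this toy input, the two seed pairs yield one shared connector, which in turn matches the third sentence and adds a new pair, while the unrelated fourth sentence is left alone. A realistic system would also need pattern generalization and some confidence scoring to limit semantic drift across iterations.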

Keywords/Search Tags:

Relation Extraction, Domain Adaptive, Relation Type Discovery, Relation Seed Extraction, Relation Description Pattern Mining

Related items

1	Extraction Of Entity Hyponymy And Synonymy Relations From Open Domain Texts
2	Research On Entity Relation Extraction In Network Encyclopedia
3	Research And Implementation Of Syntactic Pattern Recognition Approach For Chinese Relation Extraction
4	Relation Extraction From Complex Text In Open Domain
5	Chinese Entity Relation Discovery For Bigcilin
6	Research On Semi-supervised Entity Semantic Relation Extraction
7	Research On Entity Relation Extraction Algorithm Based On Semi-supervised Machine Learning
8	Research On Methods Of Relation Extraction Based On Relation Correlations
9	GCN Relation Extraction Method Integrating Degree Information, Semantic Position Attention And Dual Type Embeddin
10	Research On Joint Entity Relation Extraction Based On Deep Learning With Pointer Annotation