Open Information Extraction From Chinese Patents Using Markov Logic

Posted on:2015-09-02

Degree:Master

Type:Thesis

Country:China

Candidate:Q M Zhao

Full Text:PDF

GTID:2298330467968632

Subject:Computer software and theory

Abstract/Summary:

Chinese patents, one of the most important components of the science and technology literature, contain a great quantity of scientific research and technological innovation knowledge described in natural language. In general, it is very difficult for a computer to process, or even understand, such unstructured knowledge. Research on information extraction, which transforms semi-structured or unstructured free text into structured data, is therefore of great significance.

Over the last two decades, particular attention has been paid to developing information extraction technology. However, traditional information extraction has focused on satisfying precise, narrow, pre-specified relations, which leads to poor scalability: extensive human involvement, heavy dependence on a particular domain, and complex matching patterns. That is why research is shifting toward open information extraction, moving from small, homogeneous sets of target relations to open domains and relations.

In recent years, in contrast with the significant achievements for English and other Western languages, research on Chinese open information extraction has been quite scarce. This thesis therefore presents two studies on Chinese patent documents.

First, a new approach oriented to bilingual patent abstracts is proposed to recognize the MNP of Chinese patent text. Three types of information (word information in sentences, information transferred from TreeBanks, and bilingual information) are combined in a joint Markov Logic Network (MLN) framework to recognize the bounds of the MNP. The experimental results show that bilingual information has a strongly positive effect on the identification of verbs, and the F-score of the MNP evaluation reaches 83.27%. This is a large improvement over the 60.09% obtained by the golden Berkeley Parser. Moreover, the new approach is simple and easy to extend.

Second, a hierarchical Chinese open entity relation extraction approach is proposed that applies Markov Logic Networks on the basis of both external and internal chunk tags. The corpus for the MLN model is obtained semi-automatically by a self-learning method. The experimental results reveal that starting from chunks simplifies the understanding of sentences, and that both layers can be handled consistently, which reduces engineering effort. Under the same conditions, MLN performs better than SVM, with F-scores of 77.92% and 69.20% on the external and internal layers respectively.
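To illustrate the kind of model the abstract refers to, the following is a minimal sketch of how a Markov Logic Network assigns probabilities: each possible world is scored by the total weight of the formulas it satisfies, and probabilities are proportional to the exponential of that score. The atoms (verb_left, mnp_start), the two formulas, and their weights are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import itertools
import math

# Hypothetical ground atoms for a single token position (not from the thesis).
ATOMS = ("verb_left", "mnp_start")

# Weighted formulas as (weight, predicate over a world). Weights are invented.
FORMULAS = [
    # verb_left => mnp_start
    (1.5, lambda w: (not w["verb_left"]) or w["mnp_start"]),
    # weak prior against starting a phrase at this position
    (0.5, lambda w: not w["mnp_start"]),
]


def score(world):
    """Sum of weights of the formulas that hold in this world."""
    return sum(weight for weight, holds in FORMULAS if holds(world))


def worlds():
    """Enumerate every truth assignment to the atoms."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def probability(world):
    """P(world) = exp(score(world)) / Z, with Z summed over all worlds."""
    z = sum(math.exp(score(w)) for w in worlds())
    return math.exp(score(world)) / z


def conditional(query, evidence):
    """P(query is True | evidence), by brute-force enumeration."""
    numerator = 0.0
    denominator = 0.0
    for w in worlds():
        if all(w[atom] == value for atom, value in evidence.items()):
            mass = math.exp(score(w))
            denominator += mass
            if w[query]:
                numerator += mass
    return numerator / denominator


if __name__ == "__main__":
    print(conditional("mnp_start", {"verb_left": True}))
```

Brute-force enumeration is only feasible for a handful of atoms; practical MLN systems rely on approximate inference instead.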

Keywords/Search Tags:

Open Information Extraction, Markov Logic, Transfer Learning, SVM

Related items

1	Research On Markov Logic Networks And Its Application
2	Research And Application Of Transfer Model Based On Relational Mapping
3	Learning with Markov logic networks: Transfer learning, structure learning, and an application to web query disambiguation
4	Research On Knowledge Inference And Verification For Open Information Extraction System
5	Research On Technology Of Spam Filtering Based On Markov Logic Network
6	Research And Implementation Of Text Categorization Of Network Public Opinion Based On Markov Logic Networks
7	Markov Retrieval Model Based On Transferring Learning
8	Q & Answers Sorting Method Based On Markov Logic Network Research
9	Research On Domain Knowledge Learning And Update Technology Based On Markov Logic Networks
10	Neural Network-based Open Information Extraction And Its Application