
Applications Of Markov Logic In Citation Matching And Chinese Named Entity Recognition

Posted on: 2010-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Liao
GTID: 2178360275952291
Subject: Computer application technology
Abstract/Summary:
Many, if not most, real-world application domains in artificial intelligence (AI), such as knowledge representation, automated reasoning, machine learning, planning, and natural language processing, are characterized by both uncertainty and complex relational structure. Probabilistic graphical models handle uncertainty effectively, while first-order logic can compactly represent a wide variety of complex knowledge. AI therefore needs to combine probability and first-order logic in a single representation, and Markov logic meets this need. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula, and can be viewed as a template for constructing Markov networks. Many key tasks in statistical relational learning, such as collective classification, link prediction, link-based clustering, social network modeling, and object identification, are naturally formulated as instances of MLN learning and inference. This thesis focuses on applying Markov logic to citation matching and Chinese named entity recognition. Its main contributions are as follows.

First, to improve the accuracy of segmenting citation records, we propose an alternative joint inference approach for citation matching, Generalized Joint Segmentation. It handles the case where the type of the dataset is unknown, or where it is unclear whether the dataset should be classified as "dense", "sparse", or hybrid. For hybrid datasets in particular, there is often no prior information for choosing between the Joint Segmentation method and the Joint Segmentation Entity Resolution method to perform segmentation and entity resolution, and both methods may produce many errors. Our method avoids these segmentation errors and produces good field boundaries. Results on several citation datasets show that it outperforms many alternative approaches to citation matching. In particular, Generalized-Jnt-Seg (j=4) can be considered the best-performing setting when computational cost, RAM usage, and the time needed for weight learning and inference are taken into account.

Second, to address the challenging task of Chinese named entity recognition, we present a novel approach based on a hierarchical hybrid model. The model consists of three mutually dependent stages: boosting, Markov logic networks, and detection of abbreviated named entities. The first stage recognizes simple named entities using the AdaBoost algorithm. The second stage recognizes complex and compound named entities using Markov logic networks. The final stage classifies abbreviations of named entities using global information from the same document. Experiments on the People's Daily corpus show that our approach significantly improves recognition performance, achieving precision, recall, and F-score of 96.38%, 95.89%, and 96.47%, respectively.

In sum, the two sets of experiments, on citation matching and on Chinese named entity recognition, demonstrate that Markov logic is a powerful statistical relational model and illustrate the benefits of MLNs over purely logical and purely probabilistic approaches. Finally, the characteristics of MLNs are summarized and directions for future work are listed.
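To make concrete the description of an MLN as a set of weighted first-order formulas that acts as a template for a Markov network, the following is a minimal sketch in Python. It is not the thesis's model: the two citation records, the predicates `HasYear` and `IsComplete`, the single formula, and the weight 1.5 are all hypothetical and chosen only for illustration. The sketch uses the standard MLN distribution, in which the probability of a world is proportional to exp(w · n), where n is the number of true groundings of the formula in that world, and it computes the normalizing constant by brute-force enumeration, which is feasible only for tiny domains.

```python
import itertools
import math

# Hypothetical toy domain: two citation records (assumption, not from the thesis).
CONSTANTS = ["C1", "C2"]
# Hypothetical unary predicates; a "world" assigns True/False to every ground atom.
PREDICATES = ["HasYear", "IsComplete"]
GROUND_ATOMS = [(p, c) for p in PREDICATES for c in CONSTANTS]

# One weighted formula: HasYear(x) => IsComplete(x), with an assumed weight.
WEIGHT = 1.5


def count_true_groundings(world):
    """Return n(world): how many constants c satisfy HasYear(c) => IsComplete(c)."""
    return sum(
        1
        for c in CONSTANTS
        if (not world[("HasYear", c)]) or world[("IsComplete", c)]
    )


def all_worlds():
    """Enumerate every truth assignment to the ground atoms."""
    for values in itertools.product([False, True], repeat=len(GROUND_ATOMS)):
        yield dict(zip(GROUND_ATOMS, values))


def unnormalized(world):
    """exp(w * n(world)), the unnormalized weight of a world."""
    return math.exp(WEIGHT * count_true_groundings(world))


# Partition function, computed by brute force over all 2^4 worlds.
Z = sum(unnormalized(w) for w in all_worlds())


def probability(world):
    """P(world) = exp(w * n(world)) / Z."""
    return unnormalized(world) / Z


# A world that satisfies both groundings, and one that violates the C1 grounding.
world_ok = {atom: True for atom in GROUND_ATOMS}
world_bad = dict(world_ok)
world_bad[("IsComplete", "C1")] = False

print(probability(world_ok) / probability(world_bad))  # equals exp(WEIGHT)
```

Unlike a hard logical constraint, the violated formula does not make `world_bad` impossible; it only makes it less probable by a factor of exp(1.5), which is the sense in which Markov logic softens first-order logic.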
Keywords/Search Tags:Markov logic, Citation matching, Chinese named entity recognition