Machine Reading: from Wikipedia to the Web

Posted on:2011-05-18

Degree:Ph.D.

Type:Dissertation

University:University of Washington

Candidate:Wu, Fei

GTID:1468390011471430

Subject:Engineering

Abstract/Summary:

Berners-Lee's compelling vision of a Semantic Web is hindered by a chicken-and-egg problem, which can be best solved via machine reading: automatically extracting information from natural-language text to make it accessible to software agents. We argue that bootstrapping is the best way to build such a system. We choose Wikipedia as an initial data source because it is comprehensive, high-quality, and contains enough collaboratively created structure to launch a self-supervised bootstrapping process. We have developed three systems that realize our vision:

• KYLIN, which applies the Wikipedia heuristic of matching sentences with infoboxes to create training examples for learning relation-specific extractors.

• KOG, which automatically generates a Wikipedia Infobox Ontology by integrating evidence from heterogeneous resources via joint inference using Markov Logic Networks.

• WOE, which uses the same Wikipedia heuristic as KYLIN to create a set of matching sentences, but abstracts these examples into relation-independent training data to learn an unlexicalized open extractor.

The results of our experiments show that these automatically learned systems can render much of Wikipedia into high-quality semantic data, which provides a solid base from which to bootstrap toward the general Web.
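The sentence-to-infobox matching heuristic that the abstract attributes to KYLIN can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the case-insensitive substring match, the regex sentence splitter, and the sample article and infobox are all assumptions chosen for illustration, and the abstract does not say how matches are actually found or filtered.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def label_sentences(article_text, infobox):
    """Pair infobox attributes with sentences that mention their values.

    Each returned (attribute, value, sentence) triple is a candidate
    positive training example for a relation-specific extractor.
    Matching is a plain case-insensitive substring test (assumption).
    """
    examples = []
    for sentence in split_sentences(article_text):
        lowered = sentence.lower()
        for attribute, value in infobox.items():
            if value.lower() in lowered:
                examples.append((attribute, value, sentence))
    return examples


# Hypothetical article text and infobox, made up for this sketch.
article = (
    "Seattle is a city in Washington. "
    "It was founded in 1851. "
    "The weather is often rainy."
)
infobox = {"state": "Washington", "founded": "1851"}

examples = label_sentences(article, infobox)
for attribute, value, sentence in examples:
    print(f"{attribute} = {value!r} <- {sentence}")
```

Sentences that mention no infobox value, such as the third one here, yield no training example. A real system would need something more robust than substring matching to cope with paraphrased or ambiguous values.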

Keywords/Search Tags:

Wikipedia

Related items

1	Mining The Quality Of The Content In Wikipedia
2	A Study On The Characteristics Of Academic Papers Citing Wikipedia
3	Semantic Relevance Metric Algorithm Research Based On Wikipedia
4	Study On Acquiring Non-taxonomic Relations From Wikipedia-like Encyclopedic Web Sites
5	Essays analyzing blogs and Wikipedia
6	Methode de recherche adaptative sur le Web avec utilisation de Wikipedia pour l'expansion de requetes
7	Typifying Wikipedia articles
8	Research And Implementation On Method Of Entity Linking Based On Wikipedia
9	Hackers, cyborgs, and Wikipedians: The political economy and cultural history of Wikipedia
10	Automatically Extracting Semantic Relations From Wikipedia Text