T2LD - An automatic framework for extracting, interpreting and representing tables as Linked Data

Posted on: 2011-10-29
Degree: M.S.
Type: Thesis
University: University of Maryland, Baltimore County
Candidate: Mulwad, Varish
Full Text: PDF
GTID: 2448390002964786
Subject: Computer Science
Abstract/Summary:
We present an automatic framework for extracting, interpreting and generating linked data from tables. To represent a table as linked data, we assign every column header a class label from an appropriate ontology, link table cells (where appropriate) to entities from the Linked Open Data cloud, and identify relations between the columns of the table, which together give us an overall interpretation of the table. Using the limited evidence a table provides, namely its headers and the data in its rows and columns, we adopt a novel approach of querying existing knowledge bases such as Wikitology and DBpedia to determine the class labels for table headers. For entity linking, besides querying knowledge bases, we use machine learning algorithms, such as support vector machines and algorithms that learn to rank entities within a given set, to link a table cell to an entity. We further use the class labels, linked entities and information from the knowledge bases to identify relations between columns. We prototyped a system to evaluate our approach against tables obtained from Google Squared and Wikipedia, and against a set of tables from a dataset that Google shared with us.
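
As a rough illustration of the column-labeling idea only, the sketch below looks up each cell of a column in a knowledge base and lets the returned classes vote for the column's label. It is not the thesis's actual algorithm: the in-memory `TOY_KB` dictionary, the `label_column` function and the rank-weighted voting scheme are all assumptions made for this example, and a real system would query a service such as DBpedia or Wikitology instead.

```python
from collections import Counter

# Hypothetical toy knowledge base: cell string -> ranked candidate classes.
# This stands in for a real knowledge-base query and is an assumption.
TOY_KB = {
    "Baltimore": ["City", "PopulatedPlace"],
    "Boston": ["City", "PopulatedPlace"],
    "Maryland": ["State", "PopulatedPlace"],
    "Massachusetts": ["State", "PopulatedPlace"],
}


def label_column(cells, kb):
    """Return the class with the highest rank-weighted vote, or None.

    Each cell votes for every class the knowledge base returns for it;
    a class at rank r (0-based) receives a vote of 1 / (r + 1).
    The weighting is an illustrative choice, not taken from the thesis.
    """
    votes = Counter()
    for cell in cells:
        for rank, cls in enumerate(kb.get(cell, [])):
            votes[cls] += 1.0 / (rank + 1)
    if not votes:
        return None  # no cell was found in the knowledge base
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(label_column(["Baltimore", "Boston"], TOY_KB))
    print(label_column(["Maryland", "Massachusetts"], TOY_KB))
```

Weighting votes by rank favors the more specific class that the toy knowledge base lists first for each cell over a general class that every cell shares.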
Keywords/Search Tags: Table, Data, Linked