Research on Extracting Structured Ontology from Wikipedia Infoboxes

Posted on: 2019-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Xu
Full Text: PDF
GTID: 2428330572958213
Subject: Software engineering
Abstract/Summary:
Data mining and data analysis are currently popular research topics in computer science, and commercial internet companies are investing heavily in them. Those who control data and the methods for analyzing it hold key technology for internet development, so this research field has both academic and commercial value. Wikipedia is an open-content online encyclopedia to which millions of editors around the world contribute articles. It holds a wealth of valuable data, and many academic organizations work on processing it. Against this background, we reviewed the literature on data processing, analyzed the term structure of Wikipedia, and identified a gap in existing research. We extract structured information from Wikipedia infoboxes based on ontology theory, to help formalize the Wikipedia term structure and to complete the knowledge graph in a knowledge base. Our objective is to discover possible class structure in infoboxes from a given set of member articles. We consider three class relationships among infobox attributes and describe their influence on formalizing the Wikipedia term structure. We define several candidate features that help identify attribute relationships in infoboxes. In our research, we use ontology mapping methods to extract structured information from infoboxes, calculate similarity for attribute pairs, and construct special ontologies to illustrate the relationships between attributes. The results show that our method is feasible and scalable, extracts class relationships effectively, and helps formalize the structure of Wikipedia.
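The abstract does not say which similarity measure is applied to attribute pairs. As an illustration only, the sketch below assumes a simple Jaccard similarity over the sets of member articles whose infoboxes use each attribute. The function name `attribute_pair_similarity`, the article names, and the attribute names are hypothetical and are not taken from the thesis.

```python
from itertools import combinations


def attribute_pair_similarity(infoboxes):
    """Return {(attr_a, attr_b): score} for every pair of attributes.

    `infoboxes` maps an article title to the set of attribute names
    found in that article's infobox. The score is the Jaccard
    similarity of the two sets of articles that use each attribute
    (an assumed measure, chosen here only for illustration).
    """
    # Map each attribute to the set of articles whose infobox contains it.
    articles_by_attr = {}
    for article, attrs in infoboxes.items():
        for attr in attrs:
            articles_by_attr.setdefault(attr, set()).add(article)

    # Compare every unordered pair of attributes, keyed in sorted order.
    scores = {}
    for a, b in combinations(sorted(articles_by_attr), 2):
        users_a, users_b = articles_by_attr[a], articles_by_attr[b]
        scores[(a, b)] = len(users_a & users_b) / len(users_a | users_b)
    return scores


# Hypothetical toy data: three member articles and their infobox attributes.
infoboxes = {
    "Article A": {"name", "birth_date", "occupation"},
    "Article B": {"name", "birth_date"},
    "Article C": {"name", "population"},
}
scores = attribute_pair_similarity(infoboxes)
```

Under this assumption, attributes that tend to appear in the same articles score close to 1, while attributes that never co-occur score 0, which gives one possible signal for grouping attributes into candidate classes.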
Keywords/Search Tags:Wikipedia, Structured Information, Ontology, Class Hierarchy