Classification and relation extraction for semantic Web annotation

Posted on:2006-08-31

Degree:M.C.Sc

Type:Thesis

University:Dalhousie University (Canada)

Candidate:Satti, Asad R

Full Text:PDF

GTID:2458390008973341

Subject:Computer Science

Abstract/Summary:

The Semantic Web (SW) is built on metadata, and its envisioned usage scenarios assume that the metadata agents and computer programs need to perform sophisticated tasks is available. We propose an architecture for generating this metadata from HTML Web pages in a domain of interest, where the domain model is represented by a domain ontology. The focus of this thesis is the automatic generation of metadata for the SW using classification and relation extraction techniques.

The classification module takes a domain ontology and Web pages as input and classifies the pages into ontology classes. We evaluate several classification algorithms on Web pages from the Four Universities dataset, using page text with a limited-size feature set. With biased attribute selection, K-NN is the best classifier, with a 97% average F-measure over all classes. With unbiased attribute selection, RIPPER is the best, with a 49% average F-measure over all classes and 60% over the four best-performing classes.

For relation extraction, we assume that any two Web pages that are related must be connected by a link or a path of links. Our algorithm exploits the link structure of the Web, using breadth-first search to extract relations. The results show that the hyperlink structure of the Web can be used for relation extraction.
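The breadth-first search over hyperlinks mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual algorithm, which the abstract does not spell out; the function name `find_link_path`, the `links` dictionary layout, the `max_depth` cutoff, and the page names are all assumptions made for illustration. The sketch only finds the shortest link path between two pages, which a relation extractor could then treat as evidence of a candidate relation.

```python
from collections import deque


def find_link_path(links, source, target, max_depth=3):
    """Return the shortest hyperlink path from source to target, or None.

    links: dict mapping a page to the list of pages it links to
           (a hypothetical in-memory stand-in for a crawled link graph).
    max_depth: maximum number of links to follow (an assumed cutoff).
    """
    if source == target:
        return [source]
    visited = {source}
    queue = deque([[source]])  # each queue entry is a path of pages
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_depth:
            continue  # do not expand paths that already use max_depth links
        for nxt in links.get(path[-1], []):
            if nxt in visited:
                continue
            if nxt == target:
                return path + [nxt]
            visited.add(nxt)
            queue.append(path + [nxt])
    return None


# Hypothetical link graph for a small university site.
links = {
    "dept.html": ["faculty.html", "courses.html"],
    "faculty.html": ["prof_a.html"],
    "courses.html": ["cs101.html"],
    "prof_a.html": ["cs101.html"],
}

print(find_link_path(links, "dept.html", "cs101.html"))
```

Because breadth-first search expands pages in order of link distance, the first path found to the target is a shortest one; here the department page reaches the course page through the course listing rather than through the longer faculty route.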

Keywords/Search Tags:

Relation extraction, Semantic Web, Web pages, Metadata

Related items

1	Research Of Automatic Metadata Extraction From Template Web Pages
2	Design And Implementation Of Semantic Model Extraction For Academic Literature Retrieval Results
3	Based On A Summary Of The Semantic Relation Extraction
4	Chinese Entity Relation Extraction Base On Syntactic And Semantic Analysis
5	Research On Chinese Named Entity Semantic Relation Extraction Based On Dependency Tree
6	Research On Semi-supervised Entity Semantic Relation Extraction
7	Research On Entity Relation Extraction In Network Encyclopedia
8	Research On Chinese Feature-based Semantic Relation Extraction Between Named Entities
9	Research On Metadata Extraction Approach For PDF Document Papers
10	Research On Entity Relation Extraction Algorithm Based On Semi-supervised Machine Learning