Research And Design Of Semantic-Based Web Information Extraction System

Posted on: 2007-02-15  Degree: Master  Type: Thesis
Country: China  Candidate: X T Liu  Full Text: PDF
GTID: 2178360212972163  Subject: Computer software and theory
Abstract/Summary:
With the growth of the Internet, the Web has become one of the most important knowledge repositories, and efficient information extraction is highly desirable. How to deliver relevant information from the Internet to users automatically has become an important research issue. The information extracted by information extraction (IE) systems is not only provided to users directly, but is also the first step toward building an intelligent query system.

This paper presents the background, concepts, and history of information extraction, reviews the state of Internet information extraction, and analyzes several important information extraction tools. We summarize three disadvantages of current related techniques. To address one of them, the lack of semantic meaning, we propose linking Semantic Web techniques with information extraction.

A novel semantic wrapper technique is proposed in this paper. We extend existing syntactic wrappers, which transform the information of interest in a web document into a structured one (an XML document), so that they can extract knowledge from the Web. The extension is a semantic interpreter, which automatically adds meaning to the information extracted by the wrapper.

Finally, we design and implement an algorithm that transforms the structured document into a semantic document. In addition, we apply related ontology techniques and use RDF to describe the ontology of a specific domain.
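The abstract does not give the details of the transformation algorithm, so the following is only a minimal sketch of the general idea: a semantic interpreter that takes the XML produced by a syntactic wrapper and emits RDF statements (here serialized as N-Triples lines). The sample record, the `example.org` namespaces, the tag-to-property mapping, and the function names are all hypothetical and are not taken from the thesis.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper output: a structured XML record extracted from a web page.
WRAPPER_XML = """
<book id="b1">
  <title>Example Title</title>
  <author>Example Author</author>
  <price>10</price>
</book>
"""

# Hypothetical domain ontology: namespaces and a tag-to-property mapping.
# A real system would load these from an ontology description instead.
ONT = "http://example.org/ontology#"
BASE = "http://example.org/resource/"
TAG_TO_PROPERTY = {
    "title": ONT + "hasTitle",
    "author": ONT + "hasAuthor",
}
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def interpret(xml_text):
    """Map one XML record to a list of N-Triples lines.

    The root element becomes a typed resource; each child element whose tag
    is known to the (assumed) ontology mapping becomes a literal-valued
    property. Unmapped tags are skipped because no meaning is defined for them.
    """
    root = ET.fromstring(xml_text)
    subject = f"<{BASE}{root.get('id')}>"
    class_iri = ONT + root.tag.capitalize()
    triples = [f"{subject} <{RDF_TYPE}> <{class_iri}> ."]
    for child in root:
        prop = TAG_TO_PROPERTY.get(child.tag)
        if prop is None:
            continue  # tag has no counterpart in the assumed ontology
        value = escape_literal((child.text or "").strip())
        triples.append(f'{subject} <{prop}> "{value}" .')
    return triples


if __name__ == "__main__":
    for line in interpret(WRAPPER_XML):
        print(line)
```

In this sketch the ontology mapping is a plain dictionary, which keeps the example self-contained; the point is only that the meaning attached to each extracted field comes from the ontology, not from the wrapper itself.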
Keywords/Search Tags: Information Extraction, RDF, ontology, semantic wrapper, semantic interpreter