Research On Key Techniques For AIE-based Semiautomatic Annotation Of Web Page

Posted on: 2006-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: H X Zhu
GTID: 2178360182956636
Subject: Computer application technology
Abstract/Summary:
The key to realizing the Semantic Web is to create ontologies and to semantically mark up (i.e., annotate) Web content with ontology terms. Integrating information extraction (IE) techniques into annotation tools would greatly increase the degree of automation of semantic annotation. An adaptive IE (AIE) system uses machine learning to learn extraction rules from training data so that it can adapt to a new application domain, which is exactly what the Semantic Web environment requires. This paper analyzes the technical features of the available semantic annotation tools that integrate AIE systems and identifies two main deficiencies: they do not support OWL, the W3C-recommended Web ontology language, and they use IE functions only in a simple way. To address these, the paper first presents an AIE-based framework for semiautomatic annotation of Web pages. The framework supports the OWL Lite ontology language and uses the automatic information extraction functions provided by Amilcare (an AIE system) to realize semiautomatic semantic annotation of Web pages. In response to the deficiencies noted above, two key techniques in the framework are studied in depth: (1) Amilcare-based semiautomatic extraction of the facts to be annotated, using an active learning method; and (2) generation of OWL Lite semantic metadata from the extracted facts. Based on a detailed study of Amilcare and its API, the paper presents the method and process by which Amilcare-based active learning semiautomatically extracts the facts to be annotated in a semiautomatic annotation tool. Based on an analysis and summary of the OWL Lite language constructors and restrictions, the paper also describes the semantic metadata structure of Web pages and explains the design ideas and implementation techniques for generating semantic metadata from the extracted facts. A case study and experiments show that the technical scheme is reasonable and feasible.
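
As a rough illustration of the second technique (turning extracted facts into semantic metadata), the sketch below serializes a few facts as RDF/XML individuals of the kind an OWL Lite ontology could describe. It is not the thesis's implementation: the `facts_to_rdf` function, the `http://example.org/ontology#` namespace, and the shape of the fact tuples are all assumptions made for this example, and no Amilcare API is used.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ONT = "http://example.org/ontology#"  # hypothetical ontology namespace


def facts_to_rdf(page_url, facts):
    """Serialize extracted facts as RDF/XML.

    `facts` is a list of (individual_id, class_name, properties) tuples,
    where `properties` maps a datatype property name to a literal string.
    This input shape is an assumption made for the sketch.
    """
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("ont", ONT)
    root = ET.Element(f"{{{RDF}}}RDF")
    for individual_id, class_name, properties in facts:
        # A typed node: an individual that is an instance of the ontology class.
        node = ET.SubElement(
            root,
            f"{{{ONT}}}{class_name}",
            {f"{{{RDF}}}about": f"{page_url}#{individual_id}"},
        )
        for name, value in properties.items():
            # Each property becomes a child element holding a literal value.
            child = ET.SubElement(node, f"{{{ONT}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example_facts = [("p1", "Person", {"name": "Alice"})]
    print(facts_to_rdf("http://example.org/page.html", example_facts))
```

In a semiautomatic workflow, a user would typically confirm or correct the extracted facts before a step like this one writes them out as metadata.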
Keywords/Search Tags: semantic annotation, adaptive information extraction, ontology, OWL, Semantic Web