Research On Key Technologies Of Ontology-Based Web Information Integration

Posted on: 2005-03-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Zhang
Full Text: PDF
GTID: 1118360125467576
Subject: Computer software and theory
Abstract/Summary:
The Web offers an abundant and valuable information repository, so how to discover and obtain valuable information from it is an important and meaningful research subject. Web data is semistructured, heterogeneous, and distributed, and these characteristics make information integration on the Web challenging.

Focusing mainly on the Mediator Based Hybrid (MBH) approach, this thesis addresses several key technical problems of ontology-based Web information integration: extracting ontology concepts from object sets based on formal concept analysis (FCA), extracting information from Web tables, constructing the mediator ontology in MBH, flexible queries over ontologies, and query rewriting in MBH. The major contributions of this thesis include:

1. An IRST-based algorithm is proposed for constructing concept lattices, to improve the performance of extracting concepts from object sets. The algorithm first uses the Inter-Relevant Successive Trees (IRST) model to transform the formal context, then extracts concepts from the formal context using frequent-itemset mining techniques from data mining. During extraction, the algorithm avoids generating large numbers of candidate attribute sets. Complexity analysis and experiments show that the proposed algorithm is more efficient than some other algorithms for constructing concept lattices.

2. For Chinese information, a regular-expression-based, Web-table-oriented information extraction method is proposed. To extract Chinese information from Web tables, we analyse the characteristics of the Chinese phrases that express concepts in an ontology, and from the set of phrases that express the same concept we generalize a dialect pattern written as a regular expression. Dialect pattern matching maps Web information to concepts in the ontology, and a strategy is designed to resolve matching conflicts. Experiments show that this extraction method is applicable to Web tables.

3. Inspired by multiple-viewpoints theory in requirements engineering, a multiple-viewpoints-based approach to constructing the mediator ontology is proposed. We treat each local ontology as one viewpoint of the mediator ontology, and obtain the mediator ontology by checking for inconsistencies among the local ontologies and applying heuristic rules that infer relations between concepts in different local ontologies. This approach preserves semantics between the local ontologies and the mediator ontology.

4. We introduce the concepts of flexible and semiflexible queries into ontology querying. For the case where the ontology graph is a tree, we propose an efficient approach to evaluating semiflexible queries, which uses a tree index and a theorem on leaf-order regions to evaluate semiflexible queries over the ontology tree. Experiments show that the proposed approach is more efficient than the conventional evaluation approach.

5. A Web information integration architecture is proposed. Based on this architecture and on project requirements, we developed a prototype for Web information integration. The prototype provides functions such as ontology management, Web information extraction, and query rewriting.
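
To make concrete what extracting concepts from a formal context means in the first contribution, here is a minimal Python sketch. It is not the IRST-based algorithm of the thesis: it enumerates concepts naively by closing every attribute subset, which is exponential and only suitable for tiny examples. The sample context and all function names are hypothetical, chosen for illustration.

```python
from itertools import combinations

# Hypothetical formal context: each object maps to the attributes it has.
CONTEXT = {
    "laptop": {"electronic", "portable"},
    "desktop": {"electronic"},
    "notebook": {"portable", "paper"},
}


def extent(context, attrs):
    """Return the objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def intent(context, objs):
    """Return the attributes shared by every object in objs."""
    all_attrs = set().union(*context.values())
    return frozenset(a for a in all_attrs if all(a in context[o] for o in objs))


def formal_concepts(context):
    """Naively enumerate all formal concepts as (extent, intent) pairs.

    A formal concept is a pair (A, B) where A is exactly the set of objects
    sharing all attributes in B, and B is exactly the set of attributes
    shared by all objects in A.
    """
    all_attrs = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for combo in combinations(all_attrs, size):
            objs = extent(context, set(combo))
            # Closing the attribute set yields a valid concept; duplicates
            # collapse because the pairs are stored in a set.
            concepts.add((objs, intent(context, objs)))
    return concepts


if __name__ == "__main__":
    for objs, attrs in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(objs), sorted(attrs))
```

The concepts produced this way are the nodes of the concept lattice, ordered by inclusion of their extents. The thesis reports that its IRST-based algorithm avoids the candidate-set blow-up that this brute-force enumeration suffers from.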
Keywords/Search Tags: ontology, MBH, mediator ontology, concept lattices, multiple viewpoints, flexible queries, query rewriting