
An ontology-driven concept-based information retrieval approach for Web documents

Posted on: 2011-12-31
Degree: Ph.D.
Type: Thesis
University: University of Alberta (Canada)
Candidate: Li, Zhan
Full Text: PDF
GTID: 2448390002464016
Subject: Engineering
Abstract/Summary:
Building computer agents that can utilize the meanings in the text of Web documents is a promising extension of current search technology. Concept-based information retrieval applies "intelligent" agents to identify Web documents that match user queries. A new concept-based information retrieval framework, Hybrid Ontology-based Textual Information Retrieval (HOTIR), is introduced in this thesis. HOTIR accepts conventional keyword-based queries, translates them into concept-based queries, enriches definitions of concepts with supplementary knowledge from a knowledge base, and ranks documents by aggregating "equivalent" concepts identified in them. The concept-based queries in HOTIR are organized in a hierarchy of concepts (HofC), and definitions of concepts are added from a knowledge base to enhance their meanings. The knowledge base is a modified ontology (ModOnt) that can enrich the HofC with concept definitions in the form of related concepts, terms, their importance values, and their relations. The ModOnt relies on an adaptive assignment of term importance (AATI) scheme that continuously updates the importance of terms and concepts using Web documents. The concepts identified in a Web document that match those in the HofC are evaluated using ordered weighted averaging (OWA) operators, and documents are ranked according to the degree to which they satisfy the HofC. The case studies and experiments presented in the thesis are designed to validate the performance of HOTIR.
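The OWA-based ranking step mentioned in the abstract can be illustrated with a minimal sketch. It uses only the standard definition of an OWA operator (sort the inputs in descending order, then take a weighted sum with weights that sum to 1). The function names `owa` and `rank_documents`, the per-concept match degrees, and the weight vector are all hypothetical and chosen for illustration; the abstract does not specify how HOTIR computes match degrees or selects weights.

```python
from typing import Dict, List, Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Standard ordered weighted averaging (OWA) aggregation.

    The weights apply to positions in the descending ordering of the
    values, not to particular inputs.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def rank_documents(
    match_degrees: Dict[str, List[float]], weights: Sequence[float]
) -> List[str]:
    """Return document ids ordered by decreasing OWA score.

    match_degrees maps a document id to hypothetical degrees in [0, 1]
    to which the document matches each concept of the query hierarchy.
    """
    return sorted(
        match_degrees,
        key=lambda doc: owa(match_degrees[doc], weights),
        reverse=True,
    )


# Illustrative data only: three query concepts, two documents.
degrees = {
    "doc_a": [0.9, 0.2, 0.4],
    "doc_b": [0.6, 0.6, 0.5],
}
# Weights that emphasize the weakest matches ("and-like" behavior).
and_like = [0.2, 0.3, 0.5]
ranking = rank_documents(degrees, and_like)
```

With these assumed weights, a document that matches every concept moderately well is ranked above one that matches a single concept strongly. Shifting the weight toward the first position makes the operator behave more like a maximum, so the choice of weights controls how strictly all concepts must be satisfied.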
Keywords/Search Tags: Web documents, Concept-based information, HOTIR, HofC