
Enhancing a domain-specific digital library with metadata based on hierarchical controlled vocabularies

Posted on: 2006-09-07
Degree: Ph.D.
Type: Dissertation
University: Oregon Health & Science University
Candidate: Weaver, Mathew Jon
Full Text: PDF
GTID: 1458390005492526
Subject: Environmental Sciences
Abstract/Summary:
Natural resource managers make decisions that affect numerous organizations, individuals, and the environment. Such decisions are based on a broad range of information gleaned from a variety of reports, such as Decision Notices, Environmental Analyses, and Environmental Impact Statements, as well as various specialist reports that provide detailed scientific findings and evaluation. Many of these documents are authored by multi-disciplinary teams of experts who routinely use a wide range of terminology. The documents often contain diverse content including text, maps, scientific data, and images. This combination of varied terminology and heterogeneous documents presents an interesting information retrieval challenge.

Our primary objective is to design, construct, and evaluate a domain-specific digital library, called Metadata++, with a focus on natural resource managers. The digital library emphasizes specialized terminology, including terms from a large number of well-established, well-known classification schemes and terminologies used by multi-disciplinary experts such as soil scientists, fish biologists, wildlife biologists, fire specialists, and hydrologists. These specialists frequently use the same terms with often subtle (and occasionally significant) differences in meaning. This dissertation presents a path-based thesaurus model that supports polyhierarchies by distinguishing multiple occurrences of a term using the full path in the hierarchy, and that also supports the typical thesaurus relationships of synonymy and association. Instead of designating a single preferred term for each concept, multiple terms with the same path (that are used interchangeably) can be listed together, separated by commas. All terms are path-based and provide the framework for the entire system, including browsing, indexing, interactive search expansion, and hierarchical search results.

We describe a study that evaluates the Metadata++ library system and assesses how easily indexers and searchers understand the path-based representation of terms. We describe multiple implementations and experiments that lead to a backend storage and retrieval mechanism optimized specifically for path-based metadata. The user interface consists of a smart client application, combined with web services, that satisfies specific architectural design objectives. We explain how Metadata++ integrates with a standard geographic information system to support both spatial and keyword-based information retrieval. We conclude with a comparison to other thesaurus-based systems and a description of future work.
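
The path-based thesaurus idea in the abstract can be illustrated with a small sketch. This is not the Metadata++ implementation, which the abstract does not detail. The class name `PathThesaurus`, its methods, and the sample terms are all hypothetical, and the sketch assumes that the first term in a comma-separated list names the path segment. It shows only how a term that occurs in several places in a hierarchy can be told apart by its full path, and how interchangeable terms can share one path.

```python
from collections import defaultdict


class PathThesaurus:
    """Toy path-based thesaurus (hypothetical, for illustration only)."""

    def __init__(self):
        # full path (tuple of segments) -> interchangeable terms at that path
        self._terms_at = {}
        # term -> set of full paths at which the term occurs
        self._paths_of = defaultdict(set)

    def add(self, parent_path, terms):
        """Add a node under parent_path.

        `terms` is a comma-separated string of interchangeable terms.
        Assumption: the first term names the path segment.
        """
        names = [t.strip() for t in terms.split(",")]
        path = tuple(parent_path) + (names[0],)
        self._terms_at[path] = names
        for name in names:
            self._paths_of[name].add(path)
        return path

    def occurrences(self, term):
        """Return every full path at which the term occurs (polyhierarchy)."""
        return sorted(self._paths_of.get(term, ()))

    def descendants(self, path):
        """Return all paths strictly below `path`, e.g. for search expansion."""
        path = tuple(path)
        n = len(path)
        return sorted(p for p in self._terms_at if len(p) > n and p[:n] == path)


# Invented sample terms: the same word under two different disciplines.
thesaurus = PathThesaurus()
soils = thesaurus.add((), "Soils")
hydrology = thesaurus.add((), "Hydrology")
thesaurus.add(soils, "Erosion")
thesaurus.add(hydrology, "Erosion, Sediment loss")

for path in thesaurus.occurrences("Erosion"):
    print(" > ".join(path))
```

In this sketch, looking up "Erosion" yields one path per occurrence, so an indexer or searcher can pick the sense intended by a given discipline, while "Sediment loss" resolves to the same path as the hydrology sense of "Erosion".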
Keywords/Search Tags: Digital library, Metadata, Information