
Building topic-specific search engines: A data mining approach

Posted on: 2002-12-10
Degree: Ph.D
Type: Dissertation
University: University of California, Los Angeles
Candidate: Yi, Jeonghee
Full Text: PDF
GTID: 1468390011491664
Subject: Computer Science
Abstract/Summary:
Topic-specific search engines are becoming popular with the phenomenal growth of the World Wide Web. They achieve higher accuracy than general-purpose search engines and offer functions that general-purpose engines cannot provide. However, the topic-specific search engines available today are not cost-efficient: they require intensive human labor, and thus enormous cost, to maintain as well as to build. Efficient processing of the exploding volume of information on the World Wide Web seems to call for smarter search engines: topic-specific search engines that require far less human labor while performing almost as well as those built and maintained by humans. This dissertation is a contribution towards meeting this demand. Building and maintaining topic-specific search engines with minimal human labor requires an automatic or semi-automatic information gathering system whose outputs can be fed to the search engines. In the dissertation, I discuss techniques for four major components of the requisite information gathering system: (1) domain information extraction; (2) topic expansion; (3) topic-driven information gathering; and (4) a text-classification system for web documents. I also discuss the performance of the prototype system, a search engine for XML, that I built to test the techniques.
Keywords/Search Tags: Search, System