Using indexes and data cubes to support browsing and summarization of information in databases and digital libraries

Posted on: 2000-01-14
Degree: Ph.D
Type: Dissertation
University: University of California, Santa Barbara
Candidate: Geffner, Steven Paul
Full Text: PDF
GTID: 1468390014961865
Subject: Computer Science
Abstract/Summary:
Traditional database query models assume a “single interaction” paradigm, where a user passes a complete query to the database management system (DBMS) and the DBMS passes back the results of that query. Sometimes users wish to “browse” the data contained in the database in an interactive fashion, and what they find will influence their next query; this is especially true in digital libraries and in OLAP applications. Such interactive searching is enhanced when the DBMS has the capability to summarize the results of user queries. Summaries of query results provide an overview of what a user has found; when query result sets are very large, these summaries become a critical component in assisting users to digest what they have found, and to form their next query.

This research investigates methods of supporting browsing queries, and in particular, the efficient generation of summaries of query result sets. We propose methods of generating such summaries, and examine the performance characteristics of the methods in theoretical and experimental evaluations. We first propose the Smart Index, which provides summaries of user query result sets in time that is proportional to the area enclosed by the user's query. We then present the Relative Prefix Sum Method, which extends data cube techniques to the problem of browsing, and which provides summaries of user query result sets in constant time irrespective of query area. We develop the Dynamic Data Cube, which is the first data structure to provide sublinear performance for both queries and updates, and which supports clustered data and dynamic growth of the data space.

We use data cubes in the context of browsing large information repositories, such as digital libraries. Within this context, we propose a method of mapping classification hierarchies to an integer domain so that they may be incorporated into a multidimensional data cube. Our methods thus provide system support for simultaneously browsing along multiple numeric and classification-based dimensions.
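
To illustrate how a data cube can answer a range-sum summary in constant time irrespective of query area, the following is a minimal sketch of the basic prefix-sum technique on a two-dimensional array. It is not the Relative Prefix Sum Method or the Dynamic Data Cube described in the abstract, whose details are not given here; the function names `build_prefix_sums` and `range_sum` and the sample grid are hypothetical and chosen only for illustration.

```python
def build_prefix_sums(grid):
    """Build a padded prefix-sum table for a 2-D grid.

    P[i][j] holds the sum of grid[0..i-1][0..j-1]; row 0 and
    column 0 are zero padding so queries need no special cases.
    """
    rows, cols = len(grid), len(grid[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        row_sum = 0  # running sum of the current row up to column j
        for j in range(cols):
            row_sum += grid[i][j]
            P[i + 1][j + 1] = P[i][j + 1] + row_sum
    return P


def range_sum(P, r1, c1, r2, c2):
    """Sum of grid[r1..r2][c1..c2] (inclusive) using four table lookups."""
    return P[r2 + 1][c2 + 1] - P[r1][c2 + 1] - P[r2 + 1][c1] + P[r1][c1]


# Hypothetical sample data: each cell could be, for example, a count of
# documents falling into one (dimension A, dimension B) bucket.
grid = [
    [3, 1, 4],
    [1, 5, 9],
    [2, 6, 5],
]
P = build_prefix_sums(grid)
total_lower_right = range_sum(P, 1, 1, 2, 2)  # cells 5, 9, 6, 5
```

Each query costs four lookups regardless of how many cells the query region covers. The trade-off in this plain scheme is that changing a single cell can require rewriting a large part of the table, which is the kind of update cost that the abstract's sublinear query-and-update claim for the Dynamic Data Cube addresses.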
Keywords/Search Tags: Data, Browsing, Query, User, Digital, Methods