Constructing virtual databases on the World-Wide Web

Posted on: 2002-08-16
Degree: Ph.D
Type: Thesis
University: Stanford University
Candidate: Rajaraman, Anand
Full Text: PDF
GTID: 2468390014950274
Subject: Computer Science
Abstract/Summary:
The World-Wide Web contains a wealth of information on every imaginable topic. At the same time, relational database systems form the backbone of most commercial applications. Such applications are unable to make use of data on the Internet or corporate Intranets. To bridge this divide, we propose making web information available through a relational interface. We call a database that provides access to information from external data sources (such as the World-Wide Web) a virtual database, since the tables in it correspond to information that is not physically stored in the DBMS.

This thesis discusses research problems that arise in constructing virtual databases. These problems fall into two categories: (1) Providing access to individual websites (or other data sources) as virtual tables. (2) Integrating such virtual tables into a unified database, taking into account access restrictions on the virtual tables caused by query capability restrictions of the underlying data sources.

We describe two techniques to solve the first problem: compact skeletons and the full disjunction. A compact skeleton maps a portion of a website to a virtual table. When we have several virtual table fragments in a website that need to be composed into a “wider” virtual table, we use the full disjunction. To tackle the second problem, we describe three formalisms to model the contents and query capabilities of data sources: views, query templates with binding patterns, and limited external query processors.

Many of the ideas in the thesis have been incorporated into Junglee Corp's Virtual Database Management System (VDBMS), which has been used to build virtual databases that integrate several hundred websites for applications such as comparison shopping and online recruitment. We describe the architecture of the VDBMS and discuss some of the issues involved in building virtual databases that integrate hundreds of data sources.
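
As a rough illustration of what a query template with a binding pattern can look like, the following minimal Python sketch models a virtual table whose underlying source can be queried only when certain attributes are supplied. It is not taken from the thesis or from the VDBMS: the names `QueryTemplate`, `VirtualTable`, `search_by_city`, and the sample job listings are all hypothetical, and an in-memory list stands in for a website's search form.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List

Row = Dict[str, str]


@dataclass
class QueryTemplate:
    """One supported way to query a source (hypothetical model).

    `bound` is the binding pattern: attributes the caller must supply.
    `fetch` stands in for the call to the external source.
    """
    bound: FrozenSet[str]
    fetch: Callable[[Row], Iterable[Row]]


class VirtualTable:
    """A table whose rows are fetched on demand rather than stored."""

    def __init__(self, columns: List[str], templates: List[QueryTemplate]):
        self.columns = list(columns)
        self.templates = list(templates)

    def select(self, **bindings: str) -> List[Row]:
        unknown = set(bindings) - set(self.columns)
        if unknown:
            raise KeyError(f"unknown columns: {sorted(unknown)}")
        for template in self.templates:
            # A template is usable only if all its required attributes are bound.
            if template.bound <= set(bindings):
                rows = template.fetch(bindings)
                # Apply any extra bindings locally, after fetching.
                return [r for r in rows
                        if all(r.get(k) == v for k, v in bindings.items())]
        raise ValueError("no query template accepts these bindings")


# Made-up sample data standing in for a recruitment website.
_LISTINGS: List[Row] = [
    {"title": "Database Engineer", "city": "Palo Alto", "company": "Acme"},
    {"title": "Web Developer", "city": "Palo Alto", "company": "Initech"},
    {"title": "Database Engineer", "city": "Seattle", "company": "Initech"},
]


def search_by_city(bindings: Row) -> List[Row]:
    """Hypothetical source that can only be searched by city."""
    return [r for r in _LISTINGS if r["city"] == bindings["city"]]


jobs = VirtualTable(
    columns=["title", "city", "company"],
    templates=[QueryTemplate(frozenset({"city"}), search_by_city)],
)
```

In this sketch, `jobs.select(city="Palo Alto")` is answerable because `city` is bound, while `jobs.select(title="Web Developer")` raises `ValueError`, since the only template requires a city. An integrating system would have to plan around such restrictions when combining many sources.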
Keywords/Search Tags:Data, Virtual, World-wide, Web, Information