Research On Database Discovery And Selection In The Deep Web

Posted on:2010-04-22

Degree:Master

Type:Thesis

Country:China

Candidate:N Zhao

Full Text:PDF

GTID:2178360278472620

Subject:Computer software and theory

Abstract/Summary:

With the maturing of network technology and the rapid growth of the Web, the Web has become a huge, heterogeneous data repository. According to the depth at which data is stored, the Web can be divided into two parts: the Surface Web and the Deep Web. The Deep Web is believed to contain 550 times more data than the Surface Web, and its capacity is increasing rapidly. However, Web databases can be accessed only through the query interfaces they provide. As a result, the information in the Deep Web cannot be indexed by traditional search engines such as Google and Yahoo. To access and use this information effectively and efficiently, the data must be integrated. Because of the large scale of the Deep Web, improving integration efficiency has become an active research topic in the database field.

This paper takes a Deep Web data integration system as its target application. Faced with the vast amount of heterogeneous data in the Deep Web, it focuses on improving integration efficiency during database discovery and selection. The main work covers two aspects.

Query Routing: Much recent research on data integration focuses on "domain-based" integration. To reduce the number of data sources, the sources relevant to the user's requirements must be found. This paper introduces a source selection system, based on an attribute co-occurrence framework, for ranking and selecting Web sources.

Increment-Based Random Walk Sampling: Selecting appropriate Web databases to which queries are submitted can also improve integration efficiency. An increment-based approach, INC-HIDDEN-DB-SAMPLER, which improves on HIDDEN-DB-SAMPLER, is proposed to handle all attributes on the query interface. A set of records is obtained from the Web database as a sample.

This paper first introduces a "domain-based" data integration framework and then focuses on database discovery and selection to improve integration efficiency. The subject is a technology with broad current application, so the work has both theoretical research value and practical significance.
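
The abstract does not spell out how the sampler works, so the following is only a minimal sketch of the general random-walk idea behind sampling a database through a top-k query interface. It is not the thesis's INC-HIDDEN-DB-SAMPLER and it omits any skew correction. The names `top_k_query`, `random_walk_sample`, and `draw_samples`, the in-memory list standing in for a Web database, and the limit `k` are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Record = Tuple[int, ...]


def top_k_query(db: Sequence[Record], predicates: Dict[int, int],
                k: int) -> Tuple[List[Record], bool]:
    """Simulated query interface (hypothetical): return at most k matching
    records plus a flag telling whether more than k records matched."""
    matches = [r for r in db if all(r[i] == v for i, v in predicates.items())]
    return matches[:k], len(matches) > k


def random_walk_sample(db: Sequence[Record], domains: Sequence[Sequence[int]],
                       k: int, rng: random.Random) -> Optional[Record]:
    """One random walk: constrain attributes in random order with random
    values until the query returns between 1 and k records."""
    order = list(range(len(domains)))
    rng.shuffle(order)
    predicates: Dict[int, int] = {}
    for attr in order:
        predicates[attr] = rng.choice(list(domains[attr]))
        results, overflow = top_k_query(db, predicates, k)
        if not results:
            return None  # empty result: dead end, the caller starts over
        if not overflow:
            return rng.choice(results)  # small enough: pick one record
    return None  # still overflowing with every attribute constrained


def draw_samples(db: Sequence[Record], domains: Sequence[Sequence[int]],
                 k: int, n: int, seed: int = 0,
                 max_walks: int = 10_000) -> List[Record]:
    """Repeat random walks until n records are collected or the walk
    budget is exhausted."""
    rng = random.Random(seed)
    samples: List[Record] = []
    walks = 0
    while len(samples) < n and walks < max_walks:
        walks += 1
        record = random_walk_sample(db, domains, k, rng)
        if record is not None:
            samples.append(record)
    return samples


if __name__ == "__main__":
    toy_db = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 1), (1, 1, 0)]
    toy_domains = [[0, 1], [0, 1], [0, 1]]
    print(draw_samples(toy_db, toy_domains, k=2, n=5))
```

Records reached by shorter walks are drawn more often in this sketch, which is why a practical sampler would need some correction for skew.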

Keywords/Search Tags:

Deep Web, data integration, database discovery, database selection

Related items

1	Study On Key Techniques Of Web Database Integration In The Deep Web
2	Data Based On More Than A Web Database Integration, Auto-surf Technology
3	The Implement Of The Data Integration Of The Heterogeneous Database On The Birth_Control System
4	Research On Database Discovery And Clustering Of Deep Web
5	Research And Implementation Of Police Synthetic Database And Data Integration Platform
6	Study And Implementation Of Data Integration Technology In "One-Stop Query System"
7	The Applied Research In Bibliographic Retrieval System Base On Knowledge Discovery In Database
8	Selection Of Deep Web Database
9	Study On Data Sources Discovery And Selection On Deep Web
10	Types And Selection Of The Integration Patterns Of Deep Web Information Resource