Study On Query Relaxation For The Deep Web

Posted on: 2009-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ma
Full Text: PDF
GTID: 2178360308979284
Subject: Computer software and theory
Abstract/Summary:
With the development of information technology, the amount of information on the Web is growing rapidly. According to how deeply the information is stored, the Web can be divided into two categories: the Surface Web and the Deep Web. The latter refers to data that is stored in databases and cannot be reached by following hyperlinks, only through dynamically generated web pages. The Deep Web holds far more information than the Surface Web, so it is worth making the best use of it. However, when querying it is hard to avoid failed queries, that is, queries that return no results. Modifying the original query so that it returns a non-empty result set is more cooperative than simply notifying the user that nothing matches the query.

Based on observations of the Deep Web, this thesis presents a query relaxation solution for the Deep Web. First, it uses query probing to obtain a data sample from the underlying Deep Web database and, based on this sample, derives the importance of each attribute using approximate functional dependencies. Then it transforms the global database relationship graph (DRG) into a DRG that fits the query. Finally, the query is relaxed and executed based on this DRG.

Because of query relaxation, the number of results returned by the data sources may be very large, and some of them may not be similar to the user's query. Therefore, after receiving the results from the data sources, we first select a subset of them using the skyline method and then sort that subset by similarity to the user's query. Finally, the results that are expected to satisfy the user's requirements are returned to the user.

Based on the query relaxation and result filtering methods described above, we implement the search subsystem of the Deep Web search engine DWSearch. To handle concurrent access, the search subsystem is implemented with a distributed architecture. On the DWSearch system we conducted experiments to demonstrate the feasibility of the query relaxation and result filtering methods and the ability of the search subsystem to handle concurrent access.
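
The result filtering step summarized above (skyline selection followed by similarity sorting) can be sketched roughly as follows. The abstract does not give the thesis's actual definitions, so everything specific here is an assumption made for illustration: numeric attributes only, absolute difference as the per-attribute distance, a weighted relative distance as the similarity score, and the hypothetical names `distances`, `dominates`, `skyline`, and `rank_by_similarity`. The weights stand in for attribute importance values and are invented for the example.

```python
from typing import Dict, List

Row = Dict[str, float]


def distances(row: Row, query: Row) -> List[float]:
    """Per-attribute distance from the query (assumed: absolute difference)."""
    return [abs(row[attr] - target) for attr, target in query.items()]


def dominates(a: List[float], b: List[float]) -> bool:
    """a dominates b if it is no farther on every attribute and closer on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(rows: List[Row], query: Row) -> List[Row]:
    """Keep the rows whose distance vector is not dominated by any other row."""
    dist = [distances(r, query) for r in rows]
    return [
        r for i, r in enumerate(rows)
        if not any(dominates(dist[j], dist[i]) for j in range(len(rows)) if j != i)
    ]


def rank_by_similarity(rows: List[Row], query: Row, weights: Dict[str, float]) -> List[Row]:
    """Sort rows by weighted relative distance to the query, most similar first."""
    def score(r: Row) -> float:
        return sum(
            weights[a] * abs(r[a] - t) / max(abs(t), 1e-9)
            for a, t in query.items()
        )
    return sorted(rows, key=score)


# Hypothetical relaxed-query results for a used-car search.
query = {"price": 10000, "mileage": 50000}
weights = {"price": 0.9, "mileage": 0.1}  # assumed attribute importances
rows = [
    {"price": 10500, "mileage": 52000},
    {"price": 12000, "mileage": 50500},
    {"price": 13000, "mileage": 60000},  # farther than the first row on both attributes
    {"price": 9800, "mileage": 70000},
]

candidates = skyline(rows, query)
ranked = rank_by_similarity(candidates, query, weights)
```

In this example the third row is dropped by the skyline step because the first row is closer to the query on both attributes; the remaining rows are then ordered by the weighted score, so a row that is very close on the heavily weighted attribute can outrank one that is closer on a lightly weighted attribute.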
Keywords/Search Tags: Deep Web, query relaxation, database relationship graph, result filter, Skyline