
Research On Query Optimization For Deep Web

Posted on: 2013-02-13  Degree: Master  Type: Thesis
Country: China  Candidate: A M Zhang  Full Text: PDF
GTID: 2248330377958428  Subject: Computer system architecture
Abstract/Summary:
With the rapid growth of the amount of information in the Deep Web, academic research on the information it contains has become increasingly popular. Because the most valuable data in the Deep Web are hidden behind query interfaces, and because these data are heterogeneous and autonomous, studying this problem poses great challenges. To make better use of Deep Web information, integrating Deep Web data is indispensable so that users can retrieve valuable data quickly and efficiently. Query optimization is one of the important components of a Deep Web data integration system. Deep Web query optimization takes the output of the query planning module as the input of the query optimization module, applies effective optimization methods to the possible query plans, and selects the TOP-1 or TOP-K optimal plans. Research on query optimization for the Deep Web is currently in its infancy, and most existing methods aim at optimizing the processing of a single query plan. This thesis addresses only the problem of selecting among multiple candidate query plans, and proposes an intelligent caching optimization module and a plan cost assessment module. The intelligent cache holds the historical results of plan queries and data source queries, as well as some characteristics of the data sources. The cached information is used to preprocess the query optimization task; if preprocessing succeeds, its results are returned to the user and optimization finishes. Otherwise, the plan cost assessment module carries out the next step of the optimization.
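The cache-first flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the names `PlanCache`, `optimize`, and `assess` are hypothetical, the cache is a plain in-memory dictionary keyed by a query string, and writing assessed results back into the cache is an assumption the abstract does not state.

```python
from typing import Callable, Dict, List, Optional


class PlanCache:
    """Hypothetical in-memory cache of previously ranked plans, keyed by query."""

    def __init__(self) -> None:
        self._ranked: Dict[str, List[str]] = {}

    def get(self, query_key: str) -> Optional[List[str]]:
        # Returns None on a cache miss.
        return self._ranked.get(query_key)

    def put(self, query_key: str, ranked_plans: List[str]) -> None:
        self._ranked[query_key] = list(ranked_plans)


def optimize(
    query_key: str,
    candidate_plans: List[str],
    cache: PlanCache,
    assess: Callable[[List[str]], List[str]],
) -> List[str]:
    """Try the cache first; fall back to plan cost assessment on a miss."""
    cached = cache.get(query_key)
    if cached is not None:
        # Preprocessing succeeded: return the cached result and stop.
        return cached
    # Cache miss: run the (hypothetical) cost assessment step.
    ranked = assess(candidate_plans)
    # Assumption: the new ranking is stored for later queries.
    cache.put(query_key, ranked)
    return ranked
```

Here `assess` stands in for the plan cost assessment module; any function that takes the candidate plans and returns them in ranked order could be passed in.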
The plan cost assessment module is implemented with a skyline query. It uses three-dimensional principal component parameters, namely the number of returned records, the accuracy, and the response time of queries, to measure the query quality and query cost of each plan. With these three parameters and the skyline query algorithm, the candidate plans are prioritized so that the TOP-K optimal query plans are passed to the next step, query execution. The optimization model was implemented in a program, which meets the purpose of the system design and achieves the expected results and performance requirements. Compared with the naive method, the optimization program greatly improves TOP-K plan ranking accuracy, plan recall rate, plan accuracy, and plan response time, indicating that the optimization method performs well.
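As a rough illustration of a skyline query over the three parameters named above, the sketch below treats a plan as dominated when another plan is at least as good on record count, accuracy, and response time and strictly better on at least one. The abstract does not say how TOP-K plans are drawn from the skyline, so the layer-by-layer peeling in `top_k_by_skyline_layers` is an assumption, as are all names (`PlanMetrics`, `dominates`, `skyline`) and the sample values in the usage note.

```python
from typing import List, NamedTuple


class PlanMetrics(NamedTuple):
    name: str
    records: int          # number of returned records (higher is better)
    accuracy: float       # query accuracy (higher is better)
    response_time: float  # response time in seconds (lower is better)


def dominates(a: PlanMetrics, b: PlanMetrics) -> bool:
    """True if a is no worse than b on every dimension and better on one."""
    no_worse = (
        a.records >= b.records
        and a.accuracy >= b.accuracy
        and a.response_time <= b.response_time
    )
    strictly_better = (
        a.records > b.records
        or a.accuracy > b.accuracy
        or a.response_time < b.response_time
    )
    return no_worse and strictly_better


def skyline(plans: List[PlanMetrics]) -> List[PlanMetrics]:
    """Plans that no other plan dominates (naive O(n^2) comparison)."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


def top_k_by_skyline_layers(plans: List[PlanMetrics], k: int) -> List[PlanMetrics]:
    """Assumed ranking: peel off successive skylines until k plans are found.

    Order inside one layer follows the input order; the abstract does not
    specify a tie-breaking rule.
    """
    remaining = list(plans)
    ranked: List[PlanMetrics] = []
    while remaining and len(ranked) < k:
        layer = skyline(remaining)
        ranked.extend(layer)
        remaining = [p for p in remaining if p not in layer]
    return ranked[:k]
```

For example, with hypothetical plans A (100 records, 0.9 accuracy, 1.0 s), B (80, 0.8, 2.0 s), C (120, 0.7, 3.0 s), and D (50, 0.95, 0.5 s), plan B is dominated by A, so the first skyline layer is A, C, and D, and B appears only in the second layer.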
Keywords/Search Tags:Deep Web, Query Optimization, Intelligent Caching, Plan Cost