Font Size: a A A

New Approaches To Selectivity Estimation In Database Optimization

Posted on:2009-06-19Degree:MasterType:Thesis
Country:ChinaCandidate:X LuFull Text:PDF
GTID:2178360275992282Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
In databases system, an optimizer is the module that builds the plan of how a query will be executed. Selectivity estimation plays a key role in query optimization process of database. Inaccurate selectivity estimation leads to suboptimal plans. The execution times of an optimal and a sub-optimal plans may differ by orders of magnitude. This paper explores new approaches for selectivity estimation in database optimization.Histogram is the most popular technique of selectivity estimation. We propose a new approach to histogram construction. We use relative aggregate error to guide the construction of histogram. An algorithm is developed to build histograms by minimizing relative aggregate error. Experiments over both real-world and synthetic datasets are carried out to evaluate the performance of the proposed method.Histogram based approaches are effective only in relatively low data space; while sampling based approaches perform well in high dimension data space. Thus, we propose to combine histogram and sampling to exploit the advantages of the two approaches. Three hybrid methods are introduced. The first one samples data in every bucket with similar sampling rate. The second samples buckets with different sampling ratios. The third replaces every sample with a Guass distribution which is generated by EM algorithm. Performance of these approaches are evaluated extensively. The results show that our approaches are effective and scalable.
Keywords/Search Tags:Query Optimization, Selectivity Estimation, Histogram, Sampling, Hybrid method
PDF Full Text Request
Related items