Parallel Database Query Optimization Genetic Algorithm

Posted on: 2005-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: P Xuan
Full Text: PDF
GTID: 2208360125467718
Subject: Computer software and theory
Abstract/Summary:
Parallel query optimization is a critical technique in parallel database systems. Current research on query optimization concentrates mostly on complex relational database queries with multiple join operators. Over the past decade there has been much research into parallel multi-join query optimization, but few of the resulting methods target PC clusters. This thesis investigates the key technologies of relation distribution, multi-join query optimization, and parallel query processing on PC clusters.

Parallel computer systems based on PC clusters are an important branch of parallel processing and an efficient means of parallel computing, and they are therefore expected to dominate the development of parallel computing techniques. Drawing on the characteristics of parallel PC-cluster systems, this thesis presents new algorithms for relation distribution, parallel multi-join query optimization, and query processing.

In a PC-cluster system, network bandwidth is usually the bottleneck. The proposed algorithm takes into account the factors that cause data redistribution in a PC cluster and so reduces the additional communication cost. It also considers intra-operator parallelism, independent inter-operator parallelism, and pipelined inter-operator parallelism, so that the parallelism available in a PC-cluster system can be exploited.

Multi-join query optimization is one of the critical problems of parallel query optimization. This thesis proposes an algorithm that examines how resource allocation can be used when estimating the cost of a query execution plan. This improves the precision of the cost estimate and thereby helps ensure the quality of the plan produced by the query optimization algorithm.
In addition, the algorithm's cost model accounts for the cost of network communication and makes use of the storage distribution information of all relations, which reduces unnecessary communication.

For parallel query processing, this thesis implements algorithms for data redistribution among processors, for the join operator, and for pipelined execution, all built on the Distributed Component Object Model. Query processing thus exploits all three types of parallelism in query execution, which improves the execution efficiency of parallel multi-join queries.

A simulation experiment compares the performance of three algorithms: a greedy algorithm that chooses the smallest intermediate relation, a heuristic algorithm based on right-deep trees, and the algorithm proposed in this thesis. The results show that the proposed algorithm considerably improves the efficiency of parallel multi-join query optimization on PC-cluster systems, offers an efficient approach to multi-join query optimization, and contributes to improving the performance of parallel database systems.
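
The greedy baseline used in the comparison (always pick the join that yields the smallest intermediate relation) can be illustrated with a minimal sketch. This is not the thesis's implementation and not its proposed algorithm: the function names, the cardinality and selectivity inputs, and the simple size estimate (product of cardinalities times the selectivity of every applicable join predicate) are assumptions made for illustration only, and communication cost is ignored.

```python
from itertools import combinations


def join_size(size_a, rels_a, size_b, rels_b, selectivity):
    """Estimate the cardinality of joining two (intermediate) results.

    Assumed model: |A join B| = |A| * |B| * product of the selectivities of
    all predicates linking a relation in A to a relation in B. Pairs with
    no predicate default to 1.0 (a cross product).
    """
    size = size_a * size_b
    for r in rels_a:
        for s in rels_b:
            size *= selectivity.get(frozenset((r, s)), 1.0)
    return size


def greedy_join_order(cardinality, selectivity):
    """Return (order, sizes) for a greedy smallest-intermediate-result plan.

    cardinality: dict mapping relation name -> row count (hypothetical stats)
    selectivity: dict mapping frozenset({r, s}) -> join selectivity
    """
    remaining = sorted(cardinality)
    if len(remaining) < 2:
        return remaining, []

    def pair_size(pair):
        a, b = pair
        return join_size(cardinality[a], (a,), cardinality[b], (b,), selectivity)

    # Seed the plan with the pair whose join result is estimated smallest.
    a, b = min(combinations(remaining, 2), key=pair_size)
    order = [a, b]
    current = pair_size((a, b))
    sizes = [current]
    remaining = [r for r in remaining if r not in (a, b)]

    # Repeatedly add the relation that keeps the intermediate result smallest.
    while remaining:
        nxt = min(
            remaining,
            key=lambda r: join_size(current, order, cardinality[r], (r,), selectivity),
        )
        current = join_size(current, order, cardinality[nxt], (nxt,), selectivity)
        order.append(nxt)
        sizes.append(current)
        remaining.remove(nxt)
    return order, sizes


# Hypothetical three-relation query: R-S and S-T are joined, R-T is not.
example_cardinality = {"R": 1000, "S": 100, "T": 10}
example_selectivity = {
    frozenset(("R", "S")): 0.01,
    frozenset(("S", "T")): 0.1,
}
example_order, example_sizes = greedy_join_order(example_cardinality, example_selectivity)
```

A greedy choice of this kind looks only one join ahead and produces a left-deep order. A cost model that also charges for data redistribution between nodes, as the abstract describes, would replace `join_size` with a richer estimate.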
Keywords/Search Tags:Genetic Algorithm, Parallel Database, Multi-join Query, Parallel Query Optimization, PC Clusters.