Font Size: a A A

Research On High Performance Optimization Algorithms Of Database Query Based On CMP

Posted on:2013-02-25Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y H ChenFull Text:PDF
GTID:1118330371982692Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
CMP improves the computing power and storage space for the computer,increases thecommunication speed and bandwidth of chip. CMP provides the foundation for parallelcomputing. Moreover, the cost and consumption of CMP provide the conditions for itspopularity. Due to the structural features of the multicore architecture,Processormanufacturers take advantage of improvements in manufacturing technology to implementtwo or more"execution cores"within a single processor. Every execution core, in essence, isa relatively simple microprocessor of single-threaded or Multi-threaded microprocessor.Multi-core processors embed two or more independent execution cores into a single processorpackage. By providing multiple execution cores, each sequence of instructions, or thread, hasa hardware execution environment entirely to itself. This enables each thread run in a trulyparallel manner. so Multi-core processors has the high level of thread parallelism. In thecomputer industry, the development of hardware and software are usually complement oneanother, promotes mutually. The revolution of hardware with multi-core processors alsocontributed to the revolution of software. A large number of software design personnel use theMulti-core processors as the realization platform of parallel program design.In the relational database setting today, large queries containing many joins arebecoming increasingly common. The processing power of traditional database system basedon the platform of single processor and single computer can not meet today's requirement.How best can we correctly and effectively gain the information of complex queries, improvethe processing ability of database query? This question becomes an urgent problem, and animportant research direction of Database management technology. Chip Multi-Processor(CMP) present new opportunities for improving database performance on large queries. Theoptimizer of database management system combined with high performance running platformand the higher level thread parallelism of Multi-core processors can resolve the problemresulted from database rapid expansion. This becomes a new database research branch.Database query optimization research mainly is the optimization problem of more joinsoperation of complex relation database. Based on Multi-core processors platform and parallelcomputing technique,the research of database query optimization can be divided into twosteps. The first step uses parallel technology to produce sequential query plan. The secondstep applies the parallelization method to distinguish the unit executed in parallel in sequentialquery plan, constructs efficiently parallel query execution plan. According to these steps, the research of database parallel query optimization is divided into three aspects: the query plan,the parallel query execution plan and the parallel algorithm of operation.For the research of query plan, this paper firstly proposed the optimization constructingalgorithm of join pairs based on the bottom up enumeration algorithm, and proves rationalityof this algorithm. Based on the constructed set of effectively join pairs, this paper achievedthe parallel optimization for bottom up enumeration algorithm. The top-down algorithmbegins with a group consisting entirely of node, then considers generate all candidate logicallyequivalent multi-expression. This processing is called as logical-to-logical transformations.The traditional strategy relies on transformation rules, which do not consider the query graphto generate logical join pairs. Since the enumeration of this method is very fast, this is a veryefficient strategy if the search space is dense, e.g. for clique queries. However, if the searchspace is spare, e.g. for chain queries, this method will product many logical join pairs whichare not connected and, therefore, are not relevant for the solution. In order to resolve theseproblems, the paper improved the logical-to-logical transformations performance. Moreover,the paper proposed an algorithm to realize top-down join enumeration methods withnon-inner join predicates, extends the function of optimizer.The question of the query plan's construction is optimized combination. As the numberof join operation increasing exponentially, the search space is radical expansion. Dynamicprogramming methods, regardless of Top-down or not, face a difficult for complex queriesbecause of its inherent exponential nature owing to the explosive search space. When thenumber of relation of query outnumbers ten, the achievement based on the dynamicprogramming enumeration algorithm realizes the impossible. In order to improve theefficiency of the query processor, this paper further combined the measurement of similarsub-queries with the constructing connected join pairs. In order to take advantage ofmulti-cores architecture, a comprehensive and practical framework for parallelizing top-downdynamic programming query optimization is further been proposed to achieve partialsolutions among the identified similar sub-queries and global execution plan among theconstructed connected join pairs according to the generated partial similar sub-query plans.This method of the measurement of similar sub-queries reduces the search space, improvesthe utilization rate of storage space, and relieves the memory access latency.For the search of parallel execution plan, this paper based on the granularity of queryplan put forward optimization methods of two levels, i.e. the optimal execution strategy ofquery plan and the operation algorithm of the executive strategy. Because the resources ofCMP are shared by multiple threads, CMP system essentially is different frommulti-processors and concurrent multithreaded system (SMT). How to achieve themulti-threading execution plan, obtain the biggest resource utilization rate, are the key of thispart. This paper firstly researches how to construct the optimal execution strategy of queryplan based on CMP system. Based on the constructed pipeline executive tree, this paper analysises the parallel execution strategy of pipeline, and allots every basic operation ofincomplete pipeline executive tree among threads. This paper considered the workload ofoperation in the process of distribution, and realizes the optimal distribution of threadsworkload and pipeline buffers. Finally, this paper realized further optimization algorithmthrough parallel buffer. The high efficiency algorithms are always the concerns of importantissue by database researchers. This paper presented a new multithreaded cluster partitionstrategy according to thread allocation scheme of the partitioner, and optimized more accurateestimates for ideal partitioning fanout to reduce the cache access conflict in the multithreadedjoin phase. In the second phase, based on the radix-clustered hash tables, this paper proposeda novel multithreaded join execution strategy to achieve the workload-balancing assignmentamong threads as well as hide mostly cache miss and conflict of L2-cache.
Keywords/Search Tags:Chip multi-processor, parallel query processing, optimization algorithm, datadecomposition, data dependence
PDF Full Text Request
Related items