
Research On Key Technology Of Optimization For Multi Join Based On Hadoop

Posted on: 2017-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: M L Yue
Full Text: PDF
GTID: 2348330503987198
Subject: Computer Science and Technology
Abstract/Summary:
As society develops, the volume of data keeps growing, but much of that data is not useful, so extracting useful information from massive data is very important. The analysis and processing of massive data is therefore receiving more and more attention. MapReduce is a distributed computing framework proposed by Google in 2004. The join is a basic database operation and is used very widely. Evaluating a multi-table join on MapReduce as a sequence of two-table joins has the advantage that the logic is clear, but its disadvantages are also obvious: the two-table joins produce a large number of intermediate results and cause a lot of disk I/O overhead. The replicated join can complete a multi-table join in a single MapReduce job, which reduces disk I/O overhead, but it needs to send each tuple to multiple reducers. This thesis therefore proposes a method that combines the two-table join and the replicated join so as to keep the advantages of both. For a multi-table join, the method first generates an optimal execution plan made up of two-table joins, using an ant colony algorithm. A binary tree is then constructed from this plan, the parts of the tree to be covered by the replicated join are determined, and the best overall execution plan is obtained. Finally, experiments were conducted on a Hadoop cluster. Experiments show that our method is effective...
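The trade-off described in the abstract can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the thesis's implementation: it assumes a three-relation chain join R(a,b), S(b,c), T(c,d), simulates the shuffle with Python dictionaries instead of Hadoop, and uses hypothetical names (`hash_join`, `cascade_join`, `replicated_join`, `kb`, `kc`). The replicated join follows the commonly described scheme in which reducers form a `kb` by `kc` grid and each R or T tuple is copied to a whole row or column of that grid.

```python
from collections import defaultdict


def hash_join(left, right, left_idx, right_idx):
    """Simulate one two-table join job: match rows on one key column."""
    index = defaultdict(list)
    for r in right:
        index[r[right_idx]].append(r)
    # Drop the duplicated key column from the right-hand row.
    return [l + r[:right_idx] + r[right_idx + 1:]
            for l in left
            for r in index.get(l[left_idx], [])]


def cascade_join(R, S, T):
    """Two jobs in sequence; the R-S result is the intermediate data
    that a real cluster would write to disk between the jobs."""
    rs = hash_join(R, S, 1, 0)        # rows (a, b, c)
    result = hash_join(rs, T, 2, 0)   # rows (a, b, c, d)
    return result, len(rs)


def replicated_join(R, S, T, kb, kc):
    """One job on a kb x kc grid of reducers (hypothetical layout).
    Returns the join result and the number of tuples shuffled."""
    reducers = defaultdict(lambda: {"R": [], "S": [], "T": []})
    sent = 0
    for a, b in R:                    # R lacks c: copy to every c-bucket
        for j in range(kc):
            reducers[(hash(b) % kb, j)]["R"].append((a, b))
            sent += 1
    for b, c in S:                    # S has both keys: exactly one reducer
        reducers[(hash(b) % kb, hash(c) % kc)]["S"].append((b, c))
        sent += 1
    for c, d in T:                    # T lacks b: copy to every b-bucket
        for i in range(kb):
            reducers[(i, hash(c) % kc)]["T"].append((c, d))
            sent += 1
    result = []
    for bucket in reducers.values():  # each reducer joins its local data
        rs = hash_join(bucket["R"], bucket["S"], 1, 0)
        result.extend(hash_join(rs, bucket["T"], 2, 0))
    return result, sent


if __name__ == "__main__":
    R = [(1, 10), (2, 10), (3, 20)]
    S = [(10, 100), (20, 100), (20, 200)]
    T = [(100, 7), (200, 8)]
    print(cascade_join(R, S, T))
    print(replicated_join(R, S, T, kb=2, kc=2))
```

In this sketch the cascade pays for the size of the intermediate R-S result, while the replicated join pays for `len(R) * kc + len(S) + len(T) * kb` shuffled tuples. Which cost dominates depends on the data, which is the motivation for mixing the two methods within one plan.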
Keywords/Search Tags:multi-table joins, two table join method, replicated join method, MapReduce, ant colony algorithm