Font Size: a A A

Parallel Exhaustive Pivot Selection In Metric Space

Posted on:2020-04-09Degree:MasterType:Thesis
Country:ChinaCandidate:Z L HuFull Text:PDF
GTID:2428330599454710Subject:Master of Software Engineering
Abstract/Summary:PDF Full Text Request
To handle the variety of big data,a basic idea is to first define a universal abstraction that covers a wide range of data types,and then build a universal system for universal abstraction.According to the Genhierarchy of Big Data,Metric space has been proposed as a universal abstraction for big data.Since there are no coordinates in metric space,one usually first pick a number of reference points,pivots,and consider the distances from a data point to the pivots as its coordinates.Aiming at the problem of metric space pivot selection that has not been broken through for a long time,this thesis uses exhaustive method to obtain the performance upper bound of pivot selection,which provides data support for revealing the improvement space of existing pivots selection algorithm and finding the patterns and features of excellent pivots.However,the time complexity of the exhaustive method of selecting k pivots from n data is often as high as O(nk+2).With the increase of n and k,the calculation cost increases exponentially.In order to expand the range of n and k that can be accurately calculated,the exhaustive algorithm needs to be optimized.The research in this thesis includes the following three aspects:?1?A fast exhaustive pivot selection algorithm based on pivots set overlap and high priority of pivots set is proposed.Through MPI and CUDA technology,the parallel design of exhaustive pivot selection algorithm is carried out.Experiments show that the optimized exhaustive pivot selection algorithm and CUDA parallel algorithm have speedup ratios of 38 and nearly 200,respectively.?2?Aiming at the load imbalanced problem of several hours,a load balancing strategy?WM-RNN?combining work-stealing and multi-stage Recurrent Neural Network is proposed.Experiments show that the WM-RNN load balancing strategy has more than 68%performance improvement compared with the baseline work-stealing strategy.?3?In order to solve the problems of low algorithm implementation efficiency and time-consuming manual program tuning,we proposed a HPV algorithm framework based on algo-rithmic description module and searching space.The commonly used algorithms are expressed in modular form.Combined with the concept of task overlap degree,a new adaptive function is proposed,and the genetic algorithm is used to generate the algorithm optimization solution.Experiments show that the HPV framework can generate an optimization scheme with perfor-mance close to fast exhaustive pivots selection algorithm in 2 hours.In this thesis,firstly,the operation efficiency of exhaustive pivot selection algorithm is greatly improved through algorithm optimization and parallel design.Then,the WM-RNN load balancing strategy is proposed,which effectively alleviates the load imbalance problem existing in the calculation process.Last but not least,HPV algorithm framework is proposed to reduce the algorithm implementation and run-time parameter tuning time.The research content of this thesis points out the direction for the application of the exhaustive pivots selectionalgorithm in metric space on large data sets.
Keywords/Search Tags:Metric Space, Pivot Space, Pivot Selection, Parallel Computing, Load Balancing, Neural Network, Algorithm Framework
PDF Full Text Request
Related items