An Optimization Of Decision Tree Based On Parallelization And Applied Research

Posted on: 2016-05-15  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Long  Full Text: PDF
GTID: 2308330470467689  Subject: Computer technology
Abstract/Summary:
With the continuous development of wireless communication and MEMS technology, the volume of sensor data is growing exponentially. This massive data not only challenges traditional database storage and processor computation, but also brings new development opportunities to the field of data mining. In the era of massive data, accurate segmentation of data attributes and parallelization of data mining algorithms are becoming increasingly important for obtaining correct results. Classification algorithms are used throughout data mining, and the decision tree is one of the most important classifiers. It is easy to understand, achieves high accuracy, and does not require domain knowledge. Because the traditional decision tree algorithm is not suited to massive data processing, this paper proposes a parallel optimization algorithm that improves the efficiency of the decision tree algorithm and reduces the structural complexity of the tree. Compared with the traditional decision tree algorithm, it optimizes the selection of split points in the discretization of continuous attributes and improves the parallel implementation of attribute splitting. During decision tree construction, a parallel design builds the tree with the MapReduce programming model. The proposed parallelization scheme not only reduces the time complexity of the algorithm and addresses the decision tree's lack of scalability, but also simplifies the parallelization design process. This paper also presents a parallel implementation of the pruning algorithm to speed up decision tree pruning. Finally, based on the above theoretical research, a data mining platform for household appliance energy monitoring, focused on home air conditioning, is designed to help users understand their energy consumption effectively.
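The abstract does not give the algorithm itself, so the following is only a minimal sketch of how split-point selection for one continuous attribute might be phrased in a map/reduce style. It assumes information gain as the split criterion and runs the map step in-process rather than on a MapReduce cluster. The function names (`map_chunk`, `reduce_counts`, `best_split`) and the candidate thresholds are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from functools import reduce
from math import log2


def map_chunk(chunk, thresholds):
    """Map step (assumed design): for one data chunk, count how many
    records of each label fall on each side of each candidate threshold."""
    counts = Counter()
    for value, label in chunk:
        for t in thresholds:
            side = "left" if value <= t else "right"
            counts[(t, side, label)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: merge the partial counts produced by two chunks."""
    return a + b


def entropy(label_counts):
    """Shannon entropy (in bits) of a label -> count mapping."""
    total = sum(label_counts.values())
    return -sum((c / total) * log2(c / total)
                for c in label_counts.values() if c)


def side_counts(totals, threshold, side):
    """Extract the label counts for one side of one threshold."""
    return Counter({label: n for (t, s, label), n in totals.items()
                    if t == threshold and s == side})


def best_split(chunks, thresholds):
    """Return (threshold, information_gain) for the best candidate split."""
    # On a real cluster each map_chunk call would run on a separate node.
    partials = [map_chunk(chunk, thresholds) for chunk in chunks]
    totals = reduce(reduce_counts, partials, Counter())

    best_t, best_gain = None, -1.0
    for t in thresholds:
        left = side_counts(totals, t, "left")
        right = side_counts(totals, t, "right")
        n_left, n_right = sum(left.values()), sum(right.values())
        if n_left == 0 or n_right == 0:
            continue  # a split that leaves one side empty is not useful
        n = n_left + n_right
        gain = (entropy(left + right)
                - (n_left / n) * entropy(left)
                - (n_right / n) * entropy(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain
```

Because each chunk emits only small count tables, the reduce step stays cheap however large the chunks are, which is the usual reason for expressing split evaluation this way. Calling `best_split(chunks, thresholds)` with a list of `(value, label)` chunks returns the chosen threshold together with its information gain.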
Keywords/Search Tags: Parallel, Decision tree, MapReduce, Monitoring of energy consumption, Household appliances