Vehicle Routing Data Processing System Based On Hadoop And C4.5 Algorithm

Posted on: 2018-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Sun
Full Text: PDF
GTID: 2348330533959475
Subject: Electronic and communication engineering
Abstract/Summary:
With the growth of the national economy and accelerating urbanization in China, the automobile has become a daily necessity for tens of thousands of households. Modern cars are equipped with an Electronic Control Unit (ECU), which collects data from a variety of sensors, such as vehicle speed and the accelerator pedal opening signal. These data are transmitted over the vehicle network to a data center and stored there. Because the sensor data are large in volume and unstructured, they are difficult to store and analyze, and doing so effectively has become one of the important challenges facing car networking enterprises. The development of cloud computing and big data provides an opportunity for storing and analyzing massive vehicle network data. Based on the Hadoop big data processing platform and its ecosystem, this thesis uses the HBase distributed database to store massive vehicle network sensor data, and analyzes the data efficiently using MapReduce and an optimized C4.5 algorithm. The work is as follows:

1. Design of an HBase-based vehicle network data management system. The HBase distributed database is used to store the vehicle operating condition parameters collected by the sensors. The work includes the database design; the design of interface functions for storing and querying data; a secondary (two-level) index to support multi-condition queries; integration with Hive to provide an SQL engine; data migration based on MapReduce; and development of a web-based data management system.

2. Optimization and parallelization of the C4.5 algorithm. Based on the characteristics of C4.5, Taylor's mean value theorem is used to simplify its attribute selection metric, which avoids logarithmic operations, reduces the computational complexity of the algorithm and improves its efficiency. C4.5 is then parallelized on MapReduce to improve efficiency further. Features are extracted from the vehicle network data, and C4.5 is used to classify vehicle acceleration performance, generating decision tree classification rules for judging acceleration performance.

3. Construction and testing of the system platform. A test platform is built on Hadoop and HBase. The tests compare the data operation performance of HBase and SQL Server, measure the efficiency of parallel feature extraction, and use the extracted feature data set to verify the efficiency and accuracy of the optimized C4.5 algorithm.

The results show that HBase is clearly more efficient than SQL Server in this system. The efficiency of digital feature extraction increases exponentially with the number of cluster nodes. Compared with the original C4.5 algorithm, the optimized C4.5 algorithm improves classification efficiency without reducing classification accuracy.
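The abstract does not give the simplified attribute selection formula, so the following is only a minimal sketch of how a log-free C4.5 metric might look. It assumes a first-order Taylor expansion of the logarithm around 1 (ln p ≈ p − 1), which turns each entropy term −p ln p into p(1 − p); the thesis may use a different expansion. The function names (`entropy_exact`, `entropy_approx`, `gain_ratio`) and the toy attributes `pedal`, `gear` and `accel` are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def entropy_exact(counts):
    """Shannon entropy in bits, as used by the original C4.5."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_approx(counts):
    """Log-free surrogate (assumption): ln p ~ p - 1, so -p ln p ~ p * (1 - p).

    The constant factor 1/ln 2 is dropped because it cancels in the gain ratio.
    """
    total = sum(counts)
    return sum((c / total) * (1 - c / total) for c in counts)


def gain_ratio(rows, attr, label, impurity):
    """C4.5 gain ratio of `attr`, computed with the given impurity function."""
    n = len(rows)
    parent = impurity(list(Counter(r[label] for r in rows).values()))

    # Group the class labels by the value of the candidate attribute.
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attr], []).append(r[label])

    # Weighted impurity of the child nodes after the split.
    children = sum(
        len(p) / n * impurity(list(Counter(p).values()))
        for p in partitions.values()
    )
    # Split information: impurity of the partition sizes themselves.
    split_info = impurity([len(p) for p in partitions.values()])
    if split_info == 0:
        return 0.0
    return (parent - children) / split_info


# Hypothetical toy records: pedal opening level, gear group, acceleration class.
rows = [
    {"pedal": "high", "gear": "a", "accel": "good"},
    {"pedal": "high", "gear": "b", "accel": "good"},
    {"pedal": "low", "gear": "a", "accel": "poor"},
    {"pedal": "low", "gear": "b", "accel": "poor"},
]

for name, fn in [("exact", entropy_exact), ("approx", entropy_approx)]:
    scores = {a: gain_ratio(rows, a, "accel", fn) for a in ("pedal", "gear")}
    print(name, scores)
```

On this toy data both impurity functions rank `pedal` above `gear`, which is the property such a simplification relies on: the split chosen at each node should stay the same while the logarithm is avoided. Whether that holds on real vehicle data depends on the exact approximation used, which the abstract does not specify.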
Keywords/Search Tags:Hadoop, car networking, HBase, MapReduce, C4.5