
Research And Implementation Of Mining Association Rules For EMU Failure Data Based On Hadoop

Posted on: 2016-10-13, Degree: Master, Type: Thesis
Country: China, Candidate: H Hu, Full Text: PDF
GTID: 2308330470955798, Subject: Computer Science and Technology
Abstract/Summary:
With the development of high-speed railways, vast amounts of service data for high-speed trains have accumulated over ten years of operation, and the data continue to grow at the terabyte scale. Analyzing the massive failure data of electric multiple units (EMUs), together with the related maintenance records, is important for fault diagnosis. The service data of high-speed EMUs are diverse, large in volume, highly complex, and generated at high speed. Traditional data mining methods are time-consuming, inefficient, and poorly suited to real-time use, so they cannot meet the needs of EMU fault handling and emergency applications. This thesis therefore explores Hadoop-based analysis technologies and methods for diagnosing EMU malfunctions.

This thesis provides a solution for EMU fault diagnosis built on the distributed computing framework Hadoop, and improves the Apriori algorithm, starting from the association rule mining algorithms commonly used on Hadoop. The improvement increases the efficiency of association rule mining on EMU failure data, as demonstrated in the application. The main points of this thesis are as follows.

First, it analyzes the core technologies of Hadoop, including the distributed computing framework MapReduce, the distributed file system HDFS, and the data warehouse Hive, and proposes a Hadoop-based big data solution for analyzing EMU malfunctions. In addition, a Hadoop cluster is set up, and the EMU failure data set is cleaned using the Hive data warehouse.

Second, it presents a parallel optimization of the Apriori algorithm on Hadoop and an improved algorithm, MRAprioriT, based on iterative MapReduce computation. The improved algorithm is shown to improve speed by about 36%, which can meet the real-time requirements of EMU fault diagnosis.

Third, the improved MRAprioriT is applied to an actual EMU malfunction scenario in the laboratory, realizing a Hadoop-based association rule mining system for EMU failure data.

The EMU data mining system designed in this thesis meets the specific needs: it has good concurrent mining performance and can raise the efficiency of EMU failure data analysis.
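
The abstract does not describe the internals of MRAprioriT, so the following is only a minimal in-process sketch of how plain Apriori is commonly expressed as iterated map and reduce steps; it is not the thesis's algorithm. The function names and the fault codes in `records` are hypothetical, and a real Hadoop job would distribute the map and reduce phases across the cluster instead of running them in one Python process.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, candidates):
    """Emit (candidate, 1) for every candidate itemset contained in one record."""
    items = set(transaction)
    for cand in candidates:
        if items.issuperset(cand):
            yield cand, 1


def reduce_phase(pairs, min_support):
    """Sum the counts per itemset and keep those that reach min_support."""
    counts = defaultdict(int)
    for itemset, one in pairs:
        counts[itemset] += one
    return {s: c for s, c in counts.items() if c >= min_support}


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({i for s in frequent for i in s})
    return [c for c in combinations(items, k)
            if all(sub in frequent for sub in combinations(c, k - 1))]


def apriori(transactions, min_support):
    """Run one map/reduce pass per itemset size until no candidates remain."""
    items = sorted({i for t in transactions for i in t})
    candidates = [(i,) for i in items]
    result, k = {}, 1
    while candidates:
        pairs = (p for t in transactions for p in map_phase(t, candidates))
        frequent = reduce_phase(pairs, min_support)
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result


# Hypothetical failure records: each list holds fault codes logged together.
records = [
    ["brake_fault", "door_fault", "axle_overheat"],
    ["brake_fault", "axle_overheat"],
    ["door_fault", "axle_overheat"],
    ["brake_fault", "axle_overheat", "pantograph_fault"],
]
result = apriori(records, min_support=2)
```

Each pass of the `while` loop corresponds to one MapReduce job, which is why iteration cost tends to dominate this style of Apriori and why reducing it is a natural target for optimization.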
Keywords/Search Tags: Big Data, EMU, Data Cleaning, Data Mining, Hadoop, Apriori Algorithm