Research On Association Rules Mining Methods Of Mass Engineering Data Based On Hadoop

Posted on:2017-03-12

Degree:Master

Type:Thesis

Country:China

Candidate:B Zhou

Full Text:PDF

GTID:2308330482979316

Subject:Mechanical Manufacturing and Automation

Abstract/Summary:

In recent years, with the rapid development of high-speed EMU trains in China, a massive amount of historical maintenance and fault data has accumulated. How to use data mining technology to extract useful knowledge from these data and provide effective decision support for Electric Multiple Unit (EMU) fault diagnosis and maintenance has become an urgent practical requirement. To make use of EMU trains' historical maintenance and fault data and to guide EMU fault diagnosis, this thesis studies methods for mining association rules from mass engineering data.

Traditional association rule mining algorithms hit a bottleneck when processing massive, multi-dimensional data sets. In this thesis, Hadoop is adopted as the underlying technology to improve the traditional Frequent Pattern Growth (FP-Growth) algorithm and the traditional Apriori algorithm so that they support parallel data processing. Hadoop is an open-source distributed computing platform whose core components are the Hadoop Distributed File System (HDFS) and the MapReduce parallel programming framework. Developers can write distributed programs without understanding Hadoop's internal architecture.

The thesis analyzes existing association rule mining algorithms and their disadvantages. According to the requirements of EMU fault diagnosis, the FP-Growth and Apriori algorithms are chosen as the base algorithms for mining association rules from the EMU trains' mass fault data. First, an improved mining algorithm is proposed that uses local frequent pattern trees instead of a global frequent pattern tree. This algorithm applies parallel processing in every data processing step, and its frequent pattern search strategy is also improved. Second, an improved parallel Apriori algorithm for multi-dimensional association rule mining is proposed, which works iteratively and parallelizes the mining of candidate sets. The proposed algorithms greatly improve the efficiency of association rule mining and reduce the space cost of computation. The mining results preserve the relationships between fault information and state information while removing invalid rules.

The improved algorithms are applied to acquire the association rule knowledge hidden in the EMU trains' historical maintenance and fault data. A prototype platform for processing EMU trains' mass maintenance and fault data is also designed and implemented; it includes an authentication module, a data transmission module, a data mining module, a file management module, and others. Analysis and experimental tests show that the improved parallel algorithms proposed in this thesis are fast, efficient, and accurate in knowledge acquisition for EMU fault diagnosis.
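
The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the general idea behind an iterative, MapReduce-style Apriori, written in plain Python rather than against the Hadoop API. The map phase counts candidate sets inside each data partition, and the reduce phase sums the partial counts and keeps the sets that meet a minimum count. All function names (`map_count`, `reduce_counts`, `next_candidates`, `parallel_apriori`) and the sample records are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, candidates):
    """Map phase: count how often each candidate occurs in one partition."""
    local = Counter()
    for transaction in partition:
        for cand in candidates:
            if cand <= transaction:  # candidate is a subset of the record
                local[cand] += 1
    return local


def reduce_counts(partials, min_count):
    """Reduce phase: sum per-partition counts and keep the frequent sets."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {c: n for c, n in total.items() if n >= min_count}


def next_candidates(frequent, k):
    """Build size-k candidates whose (k-1)-subsets are all frequent."""
    items = sorted({i for s in frequent for i in s})
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    ]


def parallel_apriori(partitions, min_count):
    """Run one map/reduce round per itemset size until no candidate is left."""
    items = sorted({i for p in partitions for t in p for i in t})
    candidates = [frozenset([i]) for i in items]
    result, k = {}, 1
    while candidates:
        # On a cluster each partition would be handled by its own mapper;
        # here the partitions are simply processed one after another.
        partials = [map_count(p, candidates) for p in partitions]
        frequent = reduce_counts(partials, min_count)
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result


# Hypothetical records mixing fault and state attributes (illustration only).
partitions = [
    [frozenset({"fault=door", "speed=high"}),
     frozenset({"fault=door", "speed=high", "temp=low"})],
    [frozenset({"fault=brake", "temp=low"}),
     frozenset({"fault=door", "speed=high"})],
]
frequent_sets = parallel_apriori(partitions, min_count=2)
```

In this sketch the partitions stand in for HDFS input splits, and the only data that crosses the map/reduce boundary is the per-partition counts, which is what makes the counting step easy to distribute.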

Keywords/Search Tags:

Association rule, Data mining, Parallel FP-Growth algorithm, Parallel Apriori algorithm, Hadoop

Related items

1	Research On Association Rules Algorithm Based On Hadoop
2	Algorithm Design And Implementation Of Multi-core Parallel Association Rule Mining Environment
3	Research And Improvement Of Association Rules Mining Algorithm Based On Directed Graph
4	Research On A Parallel Data Mining Algorithm Apriori
5	Research On Parallel Association Rule Mining Algorithm Based On Hadoop Platform
6	Research On Parallel Acceleration Algorithm Of Association Rules Based On Hadoop
7	Improvement And Parallel Processing Of Association Rules Algorithm On Data Mining
8	Association Rule Of Borrowed Books
9	Research And Application Of Two Improved Association Rules Mining Algorithm
10	Research On Association Rule Mining Based On Adaptive Algorithm And Parallel Computing