Research And Application On Cluster Analysis For Quality Data Of EMU

Posted on: 2017-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Luan
Full Text: PDF
GTID: 2308330485960362
Subject: Computer technology
Abstract/Summary:
The quality of EMU parts directly affects the maintenance efficiency, operating cost and operational safety of EMUs. In recent years, quality problems have appeared repeatedly in the domestic high-speed rail industry. China still relies on manual experience when tracing a problem back to its root cause. This approach depends on the skill and experience of individual staff and has no big data research behind it as theoretical support, so it is too subjective. In addition, forecasting and targeted quality maintenance remain passive, and there is no corresponding response mechanism.

Running-state data truly reflect the real-time quality state of the parts, and the factors that make quality abnormal can be found from them. The full life-cycle data integration management platform for EMUs is already fairly mature and has accumulated a large amount of parts data. However, because the volume of data is so large, the potential relationships between data attributes cannot be discovered by hand. It is therefore necessary to study the factors that influence quality using big data mining.

Cluster analysis divides massive data into groups by similarity, which makes the data easier for users to analyze. The data in this paper are diverse in type, high-dimensional and large in volume, so the Chameleon clustering algorithm is selected, and its shortcomings in the analysis of industrial data are addressed in order to improve clustering quality. Because of the large amount of data, a traditional single machine runs too slowly to meet the needs of the analysis, so the improved algorithm is parallelized on the Hadoop platform. The main work is as follows.

(1) This paper improves the M-Chameleon algorithm. An improved scheme is put forward to address its low efficiency and the way differences in density between subclusters degrade cluster quality. Tests on data sets show that the improved algorithm produces high-quality clusters and runs quickly.

(2) To handle the huge amount of quality data, this paper designs and implements the parallelization of the clustering algorithm on the Hadoop platform, which makes it capable of handling a large number of data processing tasks.

(3) Based on the clustering results for the quality data, this paper analyzes and locates the root of the problem and puts forward a quality traceability table based on historical data, which is the basis for quality prediction.

Based on statistical principles, this paper mines the potential relationships between attributes from the massive data, which helps the high-speed railway industry locate and forecast quality problems. It has a positive effect on the development of railway technology.
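To make the idea of grouping records by similarity concrete, the sketch below illustrates only the first phase of the standard Chameleon algorithm: building a k-nearest-neighbor graph over the data points. It is not the improved algorithm of this thesis, which the abstract does not describe in detail. The function names `knn_graph` and `connected_components`, the similarity weight 1 / (1 + distance) and the sample points are all assumptions made for illustration, and connected components stand in for Chameleon's graph partitioning and merging phases.

```python
import math


def knn_graph(points, k):
    """Link each point to its k nearest neighbours.

    Returns {i: {j: weight}}. The weight 1 / (1 + Euclidean distance) is an
    illustrative similarity choice, not one taken from the thesis.
    Edges are stored in both directions, so the graph is undirected.
    """
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in dists[:k]:
            w = 1.0 / (1.0 + d)
            graph[i][j] = w
            graph[j][i] = w
    return graph


def connected_components(graph):
    """Group nodes that are reachable from each other.

    A simplified stand-in for Chameleon's partition-and-merge phases.
    """
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical two-dimensional records forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
graph = knn_graph(points, k=2)
clusters = connected_components(graph)
```

With these sample points and k = 2, every point's nearest neighbors lie in its own group, so the graph splits into two components. The full Chameleon algorithm goes further: it partitions the graph into many small subclusters and then merges them using relative interconnectivity and relative closeness.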
Keywords/Search Tags: quality of parts, EMU, Chameleon, Hadoop, locating quality problems