Research On Management Of Logistics Massive Data And Its Application In Cloud Environment

Posted on: 2015-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Pan
GTID: 2298330467472389
Subject: Logistics engineering
Abstract/Summary:
In recent years, the Internet, the mobile Internet and the Internet of Things have developed rapidly, and the growing number of Internet users has produced a growing volume of data. A single machine does not have the capacity to store such massive data well, so building a large-scale, efficient and scalable storage system has become particularly important. Cloud computing has become a research focus and has given rise to cloud storage, which has been studied in depth at home and abroad. The standard reference model for research on cloud computing and cloud storage is the Hadoop Distributed File System (HDFS), an open-source implementation of the Google File System. HDFS has many shortcomings, however; the most prominent is that its single NameNode easily becomes a performance bottleneck for the entire cluster. Building on existing HDFS research, this paper proposes a solution based on directing the NameNode, which resolves the single-NameNode performance bottleneck well. Experiments show that the scheme can expand the namespace of an HDFS cluster.

At the same time, with the growth of large logistics enterprises, how to extract useful information from their massive amounts of data has become a key research question in this field. Cloud computing offers flexible computing power, ample storage capacity, cost savings and improved efficiency, and has therefore become an effective way to address the problems facing data mining technology.
This paper first analyzes the MapReduce programming model and the Hadoop platform, then examines Mahout in depth, describing and discussing its internal data representation model. It next analyzes how the K-Means algorithm can be parallelized, and describes in detail how K-Means clustering is implemented in the MapReduce programming model and applied in Mahout. Finally, focusing on the specific situation of the logistics industry in China, the paper proposes two data mining modes, parallel and serial, and compares the efficiency of the K-Means algorithm in the two modes on massive data. The clustering results are assessed by distance measure, running time, number of iterations and other criteria, and the efficiency difference found provides good guidance for mining massive data.

By combining HDFS cloud storage based on directing the NameNode with the K-Means algorithm on the MapReduce programming model, this paper addresses the information storage and computation problems of the logistics industry: HDFS stores the massive data, and Mahout, running on top of it, mines the data in parallel to extract information useful to the logistics industry.
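
The parallel K-Means iteration mentioned in the abstract can be illustrated with a minimal sketch. It assumes the programming model referred to is Hadoop's MapReduce and that Euclidean distance is the measure; it simulates the map and reduce steps serially in plain Python and is not the thesis's or Mahout's implementation. The function names `map_point`, `reduce_cluster` and `kmeans` are hypothetical.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, available in Python 3.8+


def map_point(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_cluster(points):
    """Reduce step: average all points assigned to one centroid."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, centroids, max_iterations=20, tolerance=1e-9):
    """Repeat map/reduce rounds until centroids stop moving.

    Returns (final centroids, number of iterations run).
    """
    centroids = [tuple(c) for c in centroids]
    for iteration in range(1, max_iterations + 1):
        groups = defaultdict(list)
        # On a cluster, these map calls would run in parallel over data splits.
        for point in points:
            key, value = map_point(point, centroids)
            groups[key].append(value)
        # A centroid with no assigned points keeps its previous position.
        updated = [
            reduce_cluster(groups[i]) if groups[i] else centroids[i]
            for i in range(len(centroids))
        ]
        moved = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if moved <= tolerance:
            break
    return centroids, iteration
```

Swapping `dist` for another distance function and recording the returned iteration count is one way to reproduce, on a small scale, the kind of comparison by distance measure and number of iterations that the abstract describes.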
Keywords/Search Tags: Massive data, Cloud storage, Distributed file system, Hadoop, Logistics analysis, K-Means