Analysis And Application Of ZengLiBao Data Based On Hadoop Platform

Posted on: 2016-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: T Huang
GTID: 2428330482481299
Subject: Systems analysis and integration
Abstract/Summary:
As Internet technology matures and enterprise informatization advances, the volume of data held by financial enterprises is growing beyond easy measure. Faced with this growth, how to store, process, and analyze high-value financial data is becoming an important and difficult problem for enterprises. After Yu'eBao launched Thfund's ZengLiBao product, the number of customers rose sharply and Thfund became the largest fund company. The growing volume of ZengLiBao data now has to be stored efficiently, and valuable information has to be mined accurately from the customer data. Because the customer transaction volume is large and the customer information is complex, this is not easy. The data storage platforms in current use are mainly Oracle, SQL Server, MySQL, and similar systems, but they are no longer suitable given the rapid growth of ZengLiBao data. This thesis therefore selects Hadoop as the storage and computing platform and uses it to analyze and mine ZengLiBao data. First, the theoretical background is presented: the Hadoop sub-frameworks, including the distributed storage system HDFS, the parallel computing framework MapReduce, and the distributed data warehouse Hive, together with the K-means algorithm and the principle of its optimization. This lays the theoretical groundwork for the later analysis of ZengLiBao data with Hadoop and the optimized K-means algorithm. Then, against the background of a ZengLiBao data analysis project, the author puts Hadoop into practice: a test data set with a peak volume of 50 million records is built, three physical machines serve as Hadoop cluster nodes, and the optimized K-means clustering algorithm, run for 50 iterations, is used to analyze the ZengLiBao data. Finally, the time efficiency of a traditional platform and of the Hadoop platform is compared on the same amount of data, showing Hadoop's advantage in processing large data sets.
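The abstract does not describe how K-means is mapped onto MapReduce or what the optimization consists of. As an illustration only, the sketch below shows plain (unoptimized) K-means written as a map step and a reduce step in single-process Python. It is not the thesis's implementation and does not use Hadoop; the function names `assign_point`, `recompute_centroid`, and `kmeans` are hypothetical, and the default of 50 iterations simply mirrors the figure quoted in the abstract.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def assign_point(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def recompute_centroid(points):
    """Reduce step: average all points that were assigned to one cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, centroids, iterations=50):
    """Run map/shuffle/reduce rounds until convergence or the iteration cap."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Shuffle: group the mapped points by cluster index.
        groups = defaultdict(list)
        for p in points:
            key, value = assign_point(p, centroids)
            groups[key].append(value)
        # A cluster that received no points keeps its previous centroid.
        updated = [
            recompute_centroid(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))
        ]
        if updated == centroids:  # assignments are stable, stop early
            break
        centroids = updated
    return centroids
```

On a real cluster each round would typically be one MapReduce job, with the current centroids distributed to every mapper; the per-round job startup cost is one reason K-means on Hadoop is a common target for optimization.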
Keywords/Search Tags: Hadoop, MapReduce, Hive, K-means, ZengLiBao