Data Mining Association Algorithm Research And Realization Based On Cloud Computing

Posted on: 2014-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: G Feng
Full Text: PDF
GTID: 2248330398994094
Subject: Computer application technology
Abstract/Summary:
Information technology is a hallmark of the new technological revolution, and it has boomed since the start of the new century. At the same time, information exchange generates a large amount of data. Ongoing research has found that this data hides a great deal of useful information, but extracting it is as difficult as looking for a needle in a haystack. Google launched its Google 101 plan and officially presented the concepts and theories of the "cloud". Amazon, Microsoft, Hewlett-Packard, Yahoo, Intel, and IBM followed with cloud plans of their own. The cloud offers convenience through resource sharing and cost reduction. It assigns appropriate resources according to users' needs, including computing power, storage resources, and application services. Data mining technology makes it possible to extract this "value": it processes large, random, fuzzy data so that hidden, closely linked, and potentially valuable information is revealed. In doing so it supports scientific research and high-level decision-making, and has far-reaching academic and commercial value.

With the surge in the amount of data, the question is how to identify this "valuable" information quickly, efficiently, and cheaply. People have therefore proposed combining cloud computing and data mining. On the one hand, this makes full use of the cloud's distributed, parallel, and virtualization technology and its shared resources; on the other hand, it uses mature data mining technology. The combination improves computational efficiency, ensures load balance, and reduces cost. It also supports research and the development of high-level operators, so it has far-reaching academic and commercial value. This thesis is based on a company project in which I participated, the Value-added Service General Operation Platform for a communications carrier.
We research and analyze the related algorithms to find association rules and valuable information, which provide data support for enterprise management and high-level decision-making.

First, we introduce the background of cloud computing and some common technology platforms. Cloud computing offers several service models: Communications as a Service (CaaS), Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). The discussion focuses on the architecture of the Hadoop platform. We then introduce data mining technology, including predictive modeling, clustering, and association analysis, and present the main research directions and leading algorithms of data mining.

Second, we analyze sequence analysis algorithms, classification algorithms, and association rule algorithms, with a focus on association rules. Alongside the preliminary description of the algorithms, we introduce the business of a Chinese communications carrier encountered during the internship, and analyze the algorithms specifically in terms of that business.

Third, starting from the traditional Apriori algorithm, we analyze its performance and disadvantages. We describe several improvement techniques, based on hashing, sampling, partitioning, and incremental partitioning. To improve the efficiency of the algorithm, two improvement ideas are proposed.

Finally, we build a database based on the model. We combine the Apriori algorithm with the Hadoop platform and design the MapReduceApriori algorithm. The algorithm makes full use of the HDFS distributed file system to store data, achieves parallel processing through MapReduce, and finds frequent itemsets in massive data.
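
As a rough illustration of the idea, the level-wise counting that a MapReduce formulation of Apriori performs can be simulated in plain Python. This is a minimal sketch under stated assumptions, not the thesis's MapReduceApriori implementation: the function names (`map_count`, `reduce_count`, `next_candidates`, `mapreduce_apriori`), the in-memory `splits` standing in for HDFS blocks, and the sample transactions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_count(split, candidates):
    """Map step: emit (candidate, 1) for every candidate contained in a transaction."""
    for transaction in split:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                yield candidate, 1


def reduce_count(pairs, min_count):
    """Reduce step: sum the emitted counts and keep candidates meeting min_count."""
    totals = Counter()
    for candidate, one in pairs:
        totals[candidate] += one
    return {c: n for c, n in totals.items() if n >= min_count}


def next_candidates(frequent, k):
    """Build k-itemsets whose every (k-1)-subset is frequent (Apriori pruning)."""
    items = sorted({i for itemset in frequent for i in itemset})
    return [
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    ]


def mapreduce_apriori(splits, min_count):
    """Run one simulated map/reduce pass per itemset size until no candidates remain."""
    items = sorted({i for split in splits for t in split for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent, k = {}, 1
    while candidates:
        # In a real cluster each split would be mapped on a different node.
        pairs = [p for split in splits for p in map_count(split, candidates)]
        level = reduce_count(pairs, min_count)
        frequent.update(level)
        k += 1
        candidates = next_candidates(level, k)
    return frequent


# Hypothetical sample data: two splits of transactions, as if stored in two blocks.
splits = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]
result = mapreduce_apriori(splits, min_count=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With this sample data and a minimum count of 3, every single item and every pair is frequent, while the triple `{a, b, c}` appears in only two transactions and is dropped. Each pass over the data corresponds to one MapReduce job, which is why the number of passes is a common target when improving Apriori.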
We mine and analyze the Phonebook Manager data and obtain the relevant results. By analyzing these results we can track the status of business development and provide data support for business promotion and for the Group's management and decision-making.

In short, theory and experiments show that data mining association algorithms play a very important role in extracting the value of data under cloud computing. This work has theoretical and economic significance for academic research and commercial operations.
Keywords/Search Tags: Data Mining, Cloud Computing, Data Warehouse, Hadoop