The Research Of Data Mining Association Rules Based On Web Service

Posted on: 2013-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: B Tan
GTID: 2218330374961181
Subject: Computer application technology
Abstract/Summary:
Over the past 20 years, the World Wide Web has developed rapidly and become the world's largest public data resource. More and more enterprises are building their own information systems and have significantly improved their efficiency and returns. Large-scale e-commerce sites and social networks are also springing up and have accumulated large amounts of business data, some of which is distributed across different locations. Faced with data on this scale, ordinary Web users tend to get lost in a sea of information, and enterprises do not know whether the data can create value. Data exchange and data mining have therefore become active research topics in recent years.

For data exchange, the thesis describes Service-Oriented Architecture (SOA) and summarizes its architectural layers and its Web service foundations. On this basis, it studies service-oriented design patterns and the message exchange pattern (MEP), and proposes an approach to MEP reliability and idempotency: the Karn adaptive algorithm is used to implement a retransmission mechanism, database technology is used to persist messages, and a correlation ID design is used to make message handling idempotent.

For data mining, the thesis uses quantitative association rules to analyze the data line by line and extends the algorithm to the Hadoop distributed computing platform, designing a quantitative association rules algorithm based on MapReduce. In data preprocessing, continuous attribute values are discretized with the K-means algorithm, and the value of K is determined from statistics of the data distribution.
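The discretization step can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `kmeans_1d` and `discretize`, the evenly spaced seeding of the centers, and the sample `ages` data are all assumptions made for illustration, and K is simply passed in because the abstract does not say how it is derived from the data distribution.

```python
from typing import List


def kmeans_1d(values: List[float], k: int, max_iter: int = 100) -> List[float]:
    """Return k cluster centers for one-dimensional data (plain Lloyd iterations)."""
    ordered = sorted(values)
    # Assumption: seed the centers at evenly spaced positions in the sorted data.
    if k == 1:
        centers = [ordered[len(ordered) // 2]]
    else:
        centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(max_iter):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break  # assignments are stable, so the centers have converged
        centers = updated
    return sorted(centers)


def discretize(value: float, centers: List[float]) -> int:
    """Map a continuous value to the index of its nearest cluster center."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


# Hypothetical continuous attribute (customer ages) turned into 3 categories.
ages = [21, 22, 23, 25, 40, 41, 43, 44, 65, 67, 70]
centers = kmeans_1d(ages, k=3)
labels = [discretize(a, centers) for a in ages]
```

Each continuous value is replaced by a small integer label, so the attribute can then be treated as a categorical item when mining association rules.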
When mining frequent itemsets, a combination algorithm generates the candidate k-subsets. The MapReduce computing model is then used to count each candidate on the distributed platform, which gives the support of each candidate itemset, and candidates are pruned according to the minimum support and minimum confidence. When the frequent itemsets no longer change, the algorithm terminates. The confidence of each rule is then computed, rules whose confidence is below the minimum confidence are discarded, and the remaining rules are the resulting association rules.

Finally, the thesis designs two experiments to verify the feasibility of the proposed algorithm and the effect of the improved algorithm. Although the RTO of the retransmission and idempotent mechanism is slightly higher, by 15-20 milliseconds, than that of the mechanism without retransmission and idempotency, the reliability it provides cannot be replaced by other mechanisms. This matters most for businesses with high reliability requirements, such as online banking and stock trading. In the experiments on the MapReduce-based quantitative association rules algorithm, the thesis implements the K-means algorithm, the combination algorithm, and the improved quantitative association rules algorithm, and compares the result with the traditional quantitative association rules algorithm in terms of execution time, memory usage, and transaction attributes, all of which are improved.
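The support-counting and rule-pruning steps described in the abstract can be sketched as follows. This is a simplified, single-process imitation of the map and reduce phases, not Hadoop code and not the thesis's algorithm: the function names, the sample transactions, and the thresholds are hypothetical, and for brevity every k-subset of a transaction is enumerated directly instead of generating candidates from the previous level.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Itemset = FrozenSet[str]


def map_phase(transaction: Iterable[str], k: int) -> List[Tuple[Itemset, int]]:
    """Mapper: emit (k-subset, 1) for every k-item combination in one transaction."""
    return [(frozenset(c), 1) for c in combinations(sorted(set(transaction)), k)]


def reduce_phase(pairs: Iterable[Tuple[Itemset, int]]) -> Dict[Itemset, int]:
    """Reducer: sum the counts emitted for each itemset."""
    counts: Dict[Itemset, int] = defaultdict(int)
    for itemset, n in pairs:
        counts[itemset] += n
    return dict(counts)


def frequent_itemsets(transactions: List[Set[str]], min_support: float) -> Dict[Itemset, float]:
    """Grow k until no itemset of size k reaches the minimum support."""
    total = len(transactions)
    frequent: Dict[Itemset, float] = {}
    k = 1
    while True:
        pairs = [p for t in transactions for p in map_phase(t, k)]
        level = {s: c / total for s, c in reduce_phase(pairs).items()
                 if c / total >= min_support}
        if not level:
            break  # the frequent itemsets no longer change
        frequent.update(level)
        k += 1
    return frequent


def association_rules(frequent: Dict[Itemset, float],
                      min_confidence: float) -> List[Tuple[Itemset, Itemset, float]]:
    """Keep rules lhs -> rhs whose confidence support(lhs and rhs) / support(lhs) is high enough."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                lhs = frozenset(combo)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical transactions whose items are already discretized attribute values.
transactions = [
    {"age=young", "income=low"},
    {"age=young", "income=low"},
    {"age=young", "income=high"},
    {"age=old", "income=high"},
]
frequent = frequent_itemsets(transactions, min_support=0.5)
found = association_rules(frequent, min_confidence=0.8)
```

On a real cluster the mapper and reducer would run as distributed tasks over partitions of the transaction data; here they are ordinary functions so that the counting logic can be followed end to end.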
Keywords/Search Tags: MEP, Hadoop, MapReduce, K-means, Quantitative Association Rules