
Rough Association Rules Algorithm Research With Big Data

Posted on: 2015-12-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Mi
GTID: 2298330467969879
Subject: Systems analysis and integration
Abstract/Summary:
Data is an important resource for any organization, and with the development of information technology, data volumes are growing faster than ever before. Unlike other resources, however, data that cannot be fully used brings an organization not a benefit but a burden. Various data mining technologies now play an important role in handling these data resources.

Traditional data mining patterns and algorithms are designed mainly for the analysis and mining of small volumes of structured data. For big data and unstructured data, researchers need to rethink the original algorithms and models so that data mining patterns better fit the characteristics of big data.

Among data mining patterns, rough set theory is a tool for dealing with uncertainty and fuzziness in knowledge data; it now plays an important role in artificial intelligence, pattern recognition, decision analysis and other fields. Analysis of the traditional Apriori algorithm for association rules shows that it not only scans the transaction database many times but also generates a very large number of candidate itemsets while processing the data. It is therefore not suitable for processing large-scale and unstructured data. The parallel association rule algorithms put forward by many scholars can find only the certain relationships behind frequent transactions and cannot mine negative relationships. In practical applications, however, negative relationships are as important as positive ones.

To this end, this paper analyzes the characteristics of transaction databases in depth and, combining the principle of the Boolean matrix with rough set classification and the MapReduce parallel programming model, proposes a MapReduce-based algorithm for rough association rules with negation to handle negative relationships in massive data.

Theoretical analysis and experimental results demonstrate that the proposed parallel algorithm not only improves on the effectiveness of existing parallel algorithms but also reveals the negative relationships behind massive data. It is a useful attempt at applying rough set theory to big data.
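The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it names: transactions encoded as rows of a Boolean matrix, with support counted for itemsets that may contain negated items, in a map step over data splits and a reduce step that sums the partial counts. It is not the thesis's method. The toy data and all names (`TRANSACTIONS`, `to_boolean_row`, `map_phase`, `reduce_phase`, `support_counts`) are hypothetical, and plain Python stands in for a real MapReduce framework. Rough set classification is not modeled here.

```python
from collections import Counter
from functools import reduce
from itertools import combinations

# Hypothetical toy transaction database (not taken from the thesis).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "milk", "butter"},
]
ITEMS = sorted(set().union(*TRANSACTIONS))


def to_boolean_row(transaction):
    """Encode one transaction as a row of the Boolean matrix."""
    return tuple(item in transaction for item in ITEMS)


def literals(row):
    """Return (item, present) pairs; present=False stands for the negated item."""
    return [(item, bit) for item, bit in zip(ITEMS, row)]


def map_phase(rows, size=2):
    """Map step: count every combination of `size` literals within one split."""
    counts = Counter()
    for row in rows:
        for combo in combinations(literals(row), size):
            counts[combo] += 1
    return counts


def reduce_phase(partials):
    """Reduce step: sum the per-split counts into global support counts."""
    return reduce(lambda a, b: a + b, partials, Counter())


def support_counts(transactions, n_splits=2, size=2):
    """Split the Boolean matrix, map each split, then reduce the results."""
    rows = [to_boolean_row(t) for t in transactions]
    splits = [rows[i::n_splits] for i in range(n_splits)]
    return reduce_phase(map_phase(s, size) for s in splits)


if __name__ == "__main__":
    counts = support_counts(TRANSACTIONS)
    # Support count of the itemset {bread, NOT tea}.
    print(counts[(("bread", True), ("tea", False))])
```

Because every row contributes a literal for each item, present or absent, one pass over the matrix yields counts for positive and negative combinations alike. The sketch enumerates all combinations without pruning, so it illustrates the data layout only and would not scale as written.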
Keywords/Search Tags: Massive Data, Data Mining, MapReduce, Apriori Algorithm, Rough Association Rules with Negation, Rough Sets