
Research On Decision Tree Algorithm For Privacy-Preserving

Posted on: 2009-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: H Cao
GTID: 2178360245956755
Subject: Communication and Information System
Abstract/Summary:
In recent years, data mining techniques have been widely used in finance, medicine, and other fields. The patterns and rules that data mining uncovers guide our work, but the technique also creates problems in daily life, the most serious of which is the exposure of private information. In data mining, privacy has two aspects: the original data being mined, and the valuable rules mined from that data.

Privacy-preserving data mining is currently studied along two lines: randomization and encryption. Randomization alters the original data and is used for centrally stored data. Encryption encrypts the original data and the processing results and is used in distributed data mining.

This thesis presents an improved decision tree algorithm. The algorithm uses the original data to construct a "one-step transition probability matrix on attributes". Based on this matrix, and following the attributes chosen while the decision tree is being built, the probabilities between attributes can be calculated automatically. As a result, the tree-building algorithm does not need to obtain probabilities from the original data when calculating entropy. The algorithm also improves the termination condition for tree construction, so that construction does not stop on the basis of having used all of the attributes. The original data are therefore left unaffected.

Because the "probability matrix on attributes" is calculated from the "one-step transition probability matrix on attributes", its values differ from those obtained directly from the original data. The modified termination condition also has some effect on the correctness of the decision tree. However, experiments show that the resulting decision tree, despite these differences, still achieves good precision and meets the needs of applications. The method therefore protects the original data effectively without reducing the classification precision of the decision tree.
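
The central idea of the abstract, computing entropy from attribute-to-attribute probabilities rather than from the raw records, can be illustrated with a small sketch. This is not the thesis's algorithm, whose exact matrix definition is not given here. The sketch assumes that each one-step matrix holds the conditional probabilities between two adjacent attributes, and that probabilities between non-adjacent attributes are approximated by chaining those matrices. The function names (`transition_matrix`, `chain`, `conditional_entropy`) and the toy records are hypothetical.

```python
import math
from collections import Counter

def transition_matrix(rows, src, dst):
    """Estimate P(dst = b | src = a) from the records, as a nested dict."""
    pair_counts = Counter((r[src], r[dst]) for r in rows)
    src_counts = Counter(r[src] for r in rows)
    matrix = {}
    for (a, b), n in pair_counts.items():
        matrix.setdefault(a, {})[b] = n / src_counts[a]
    return matrix

def marginal(rows, attr):
    """Estimate P(attr = v) from the records."""
    counts = Counter(r[attr] for r in rows)
    return {v: n / len(rows) for v, n in counts.items()}

def chain(first, second):
    """Compose two one-step matrices: P(c | a) ~ sum_b P(b | a) * P(c | b).

    Assumption of this sketch: a Markov-style approximation between attributes.
    """
    out = {}
    for a, row in first.items():
        acc = {}
        for b, p_ab in row.items():
            for c, p_bc in second.get(b, {}).items():
                acc[c] = acc.get(c, 0.0) + p_ab * p_bc
        out[a] = acc
    return out

def conditional_entropy(prior, matrix):
    """H(target | attr) in bits, computed from probabilities only."""
    h = 0.0
    for a, p_a in prior.items():
        for p in matrix.get(a, {}).values():
            if p > 0:
                h -= p_a * p * math.log2(p)
    return h

# Hypothetical toy records; the raw data is read only in this setup phase.
rows = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
outlook_to_windy = transition_matrix(rows, "outlook", "windy")
windy_to_play = transition_matrix(rows, "windy", "play")
p_outlook = marginal(rows, "outlook")
p_windy = marginal(rows, "windy")

# From here on, only the stored probabilities are used, never the records.
outlook_to_play = chain(outlook_to_windy, windy_to_play)
h_play_given_windy = conditional_entropy(p_windy, windy_to_play)
h_play_given_outlook = conditional_entropy(p_outlook, outlook_to_play)
print(h_play_given_windy, h_play_given_outlook)
```

Once the matrices and marginals are stored, attribute selection can compare conditional entropies without reading the records again. On real data, the chained probabilities will generally differ from those counted directly, which is the source of the discrepancy the abstract mentions.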
Keywords/Search Tags: data mining, privacy-preserving, decision tree, one-step transition probability matrix on attributes, probability matrix on attributes