
Research On Data Privacy Protection Based On Classification Mining

Posted on: 2018-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Liao
Full Text: PDF
GTID: 2348330536988534
Subject: Computer application technology
Abstract/Summary:
With the rapid development of networked data, data mining can extract useful information from large volumes of data for researchers to analyze. But while data mining produces useful knowledge and brings convenience to everyday life, it also inevitably risks leaking users' sensitive private information. Many methods have been proposed to protect privacy in classification mining, but every one of them degrades the original user data to some degree. How to keep users' private information from leaking while maximizing the availability of the data for classification has therefore become one of the hot research topics in privacy-preserving data mining in recent years.

First, this thesis summarizes the research background, significance, and current state of privacy protection methods in classification data mining, and describes in detail the classical K-anonymity model and the related techniques of privacy-preserving anonymous classification mining.

Second, considering that different quasi-identifier attributes contribute to the classification of sensitive attributes to different degrees, a classification anonymity algorithm based on weighted attribute entropy is proposed. The algorithm uses information entropy to measure how important each quasi-identifier attribute is for classifying the sensitive attributes, computes the optimal weighted attribute entropy to partition the classification data, and defines an information loss metric for anonymization. From the classification information and the anonymization information loss, it constructs a classification anonymity protection measure that protects data privacy while preserving availability for classification.

Third, while maintaining the balance between data security and usability, a classification privacy protection algorithm based on an attribute anonymization strategy is put forward to address the loss of data quality caused by excessive anonymization of attributes. The algorithm combines the anonymization generalization level with an optimal partitioning attribute strategy to reduce the information loss caused by over-anonymization. When an attribute is anonymized, a classification impurity index is used to select the optimal partition; numerical attributes and categorical attributes are each anonymized and brought under an appropriate metric, which reduces excessive anonymization and preserves the availability of the classified data.

Finally, the validity of the two proposed algorithms is verified by comparison with classical algorithms. The experimental results show that both algorithms better preserve data availability, and do so efficiently, while protecting privacy-sensitive information.
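The two ideas the abstract leans on, entropy-based attribute weighting and k-anonymity, can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract does not give its formulas, so the sketch assumes that an attribute's importance is its information gain with respect to the sensitive attribute, normalized so that the weights sum to 1. The function names, the toy records, and the column names (`age`, `zip`, `disease`) are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Drop in the entropy of `target` after grouping rows by `attr`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def attribute_weights(rows, quasi_identifiers, target):
    """Assumed weighting: information gain normalized to sum to 1."""
    gains = {a: information_gain(rows, a, target) for a in quasi_identifiers}
    total = sum(gains.values())
    if total == 0:  # no attribute is informative: fall back to equal weights
        return {a: 1 / len(gains) for a in gains}
    return {a: g / total for a, g in gains.items()}


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(n >= k for n in counts.values())


# Hypothetical, already-generalized toy records.
rows = [
    {"age": "20-30", "zip": "100**", "disease": "flu"},
    {"age": "20-30", "zip": "100**", "disease": "flu"},
    {"age": "30-40", "zip": "100**", "disease": "cancer"},
    {"age": "30-40", "zip": "100**", "disease": "cancer"},
]
weights = attribute_weights(rows, ["age", "zip"], "disease")
```

In this toy table `age` separates the two diseases completely while `zip` carries no information, so a weighting of this kind would suggest generalizing `zip` first when more anonymity is needed, since doing so costs the classifier less.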
Keywords/Search Tags: classification mining, privacy protection, k-anonymity, entropy property, classification impurity index