Design And Implementation Of Intelligent Data Desensitization Middleware

Posted on:2020-06-07

Degree:Master

Type:Thesis

Country:China

Candidate:M C Wu

Full Text:PDF

GTID:2428330596487373

Subject:Engineering, Software Engineering

Abstract/Summary:

The internet now reaches every sector of society: mobile terminals are widely deployed, connected devices are everywhere, and the internet plays a central role in daily life. People generate data continuously as they work, study and live, and much of it can be digitized and collected by mobile apps, PC software and other internet-connected devices. The volume of data is growing exponentially, which has driven the rise of the big data industry. Because mining and analyzing large data sets yields significant value, companies in many fields treat big data services as a priority. Protecting data privacy has therefore become an unavoidable problem for both service providers and users in the big data business. Data that has not been desensitized is liable to leak during publishing, sharing and dissemination, and models and algorithms for data encryption and desensitization have attracted many researchers in industry and government.

Data comes in many types, and databases and systems differ in how they store it, so the ways different types of data leak, and the methods for preventing leaks, differ as well. Intelligent data desensitization middleware has emerged to serve data needs for different purposes while preventing sensitive data from leaking, by turning it into usable "fake data" during production, testing and sharing. Based on the data dependencies that exist between tables of structured data, this thesis designs several desensitization models and algorithms to protect various types of data against attack. The desensitization algorithms in wide use today are k-anonymity, l-diversity and t-closeness, together with the traditional desensitization principles of transformation, generalization and concealment. In this thesis, the original sensitive data is desensitized according to the rules in the algorithm library.

The Intelligent Data Desensitization Middleware consists of four parts: the data source interface, the desensitization strategy, the data desensitization service, and role-based data authority. The data source interface obtains data from different sources, including Hive, Spark SQL, MySQL and Oracle, and the data is then processed and computed with Spark for speed. The middleware uses the Shiro security framework to control user privileges, so that different permissions can be managed and configured for different data. The traditional desensitization methods and the k-anonymity, l-diversity and t-closeness algorithms are encapsulated in the desensitization rules.
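As a rough illustration of the kinds of rules named in the abstract, the sketch below shows concealment (masking), generalization, and a k-anonymity check in plain Python. It is not the thesis's implementation, which runs on Spark; the function names, field names, quasi-identifier choice and sample records are all hypothetical and chosen only for the example.

```python
from collections import Counter


def mask(value, keep_head=3, keep_tail=4, fill="*"):
    """Concealment: hide the middle of a string, keeping a head and a tail."""
    if len(value) <= keep_head + keep_tail:
        # Too short to keep anything safely: hide the whole value.
        return fill * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + fill * hidden + value[len(value) - keep_tail:]


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a fixed-width range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical sample records (made up for illustration only).
records = [
    {"phone": "13812345678", "age": 23, "zip": "730000", "disease": "flu"},
    {"phone": "13987654321", "age": 27, "zip": "730000", "disease": "asthma"},
    {"phone": "13700001111", "age": 34, "zip": "730000", "disease": "flu"},
    {"phone": "13600002222", "age": 38, "zip": "730000", "disease": "diabetes"},
]

# Apply one rule per sensitive or quasi-identifying column.
desensitized = [
    {
        "phone": mask(r["phone"]),
        "age": generalize_age(r["age"]),
        "zip": r["zip"],
        "disease": r["disease"],
    }
    for r in records
]

if __name__ == "__main__":
    for row in desensitized:
        print(row)
    print("raw data 2-anonymous:", is_k_anonymous(records, ["age", "zip"], 2))
    print("desensitized 2-anonymous:", is_k_anonymous(desensitized, ["age", "zip"], 2))
```

In this toy data set the raw rows fail the check because every exact age is unique, while the generalized rows form two groups of two and pass for k = 2. l-diversity and t-closeness would add further conditions on the sensitive column within each group, which this sketch does not cover.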

Keywords/Search Tags:

data desensitization, big data platform, k-anonymity, l-diversity, t-closeness

Related items

1	Research And Implementation Of Data Desensitization System For Preserving Statistical Characteristics Of Sensitive Data
2	Research On The Technology Of Data Desensitization In Trade Secret Protection
3	Research On Data Desensitization Based On Deep Learning
4	Research On Privacy-preserving Data Publishing Algorithms Based On Different Anonymity Requests
5	Research On Clustering Algorithm Of Data Table Anonymity
6	Research On Anonymity Models And Algorithms For Privacy-Preservation Data Publishing
7	Research On Privacy Preserving Data Publishing Based On Anonymity Models
8	Design And Implementation Of Data Transmission Platform Based On Kettle
9	Research Of Privacy Preserving Data Mining Techniques Under Anonymity
10	Research On Anonymity Algorithm For Incomplete Medical Data Based On L-diversity