With the rapid development of mobile internet, big data, and other network technologies, people generate ever more data in their daily lives, much of which contains personal information such as location and online consumption records. Cloud computing provides a storage platform for these data and allows their potential value to be exploited, but once the data are outsourced to a public cloud, the data holder faces a substantial risk that personal privacy will be compromised. During data mining it is therefore important to adopt privacy protections that safeguard the sensitive information of data holders and ensure that data processors comply with data security laws. Among currently popular approaches, differential privacy is widely used to protect sensitive information during data processing, but it can still be a black-box operation from the data holder's point of view, which leads data holders to distrust data processors. To address this distrust, this paper proposes a double privacy protection method: differential privacy is first applied to the statistical algorithm, and synthetic data are then generated from the result by plug-in sampling. The regression coefficient matrix and the covariance matrix are estimated from the synthetic data under a regression model with multiple response variables. Theoretical results establish the distributions of the estimators computed from the synthetic data, together with two exact inference methods, one based on the mean synthetic covariance (MSC) and one based on a combination of the mean synthetic covariance and the cross synthetic covariance (MSC_CSC), thereby providing statistical tools for hypothesis testing. A simulation study of the two exact inference methods under double privacy protection finds that the estimated coverage probability of the confidence region for the regression coefficient matrix A is approximately 0.94, which supports the validity of the statistical inference and confirms that the proposed method can provide useful information for statistical analysis while protecting the privacy of the original data. Finally, the paper applies the method to the 2000 US Current Population Survey public use data, showing that the proposed inference methods remain valid and that the risk of privacy breach is lower than under single protection, in which synthetic data are generated by plug-in sampling alone. The paper is structured as follows: first, it demonstrates the scientific validity of double privacy protection at a theoretical level, with the preliminary conjecture that combining differential privacy with plug-in sampling reduces the risk of privacy breach compared with single protection; second, it establishes two exact inference methods based on the likelihood principle; finally, it confirms the validity of the proposed statistical inference through simulations, showing that data processed with double privacy protection can still be analysed statistically, and evaluates the risk of privacy breach on real data, finding that double protection does provide a higher level of protection than single protection.
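The two-stage idea described above (perturb the fitted statistics, then draw synthetic responses by plug-in sampling) can be illustrated with a minimal Python sketch. The abstract does not specify the noise mechanism, its calibration, or how the privacy budget is divided, so everything below is an assumption made for illustration: the Laplace mechanism, the even split of `epsilon` between the two statistics, the user-supplied `sensitivity` value, the eigenvalue clipping in `nearest_psd`, and all function names are hypothetical and are not the paper's procedure.

```python
import numpy as np


def fit_multivariate_regression(X, Y):
    """Least-squares estimates for Y = X @ A + E, rows of E ~ N(0, Sigma)."""
    n, p = X.shape
    A_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ A_hat
    Sigma_hat = resid.T @ resid / (n - p)
    return A_hat, Sigma_hat


def laplace_perturb(stat, sensitivity, epsilon, rng):
    """Laplace mechanism: add noise with scale sensitivity / epsilon.

    `sensitivity` is a hypothetical, user-supplied bound; deriving a valid
    bound requires assumptions on the data range that are not made here.
    """
    return stat + rng.laplace(scale=sensitivity / epsilon, size=stat.shape)


def nearest_psd(S, floor=1e-8):
    """Symmetrise S and clip its eigenvalues so it is a usable covariance."""
    S = (S + S.T) / 2
    eigvals, eigvecs = np.linalg.eigh(S)
    return (eigvecs * np.clip(eigvals, floor, None)) @ eigvecs.T


def double_protected_synthetic(X, Y, epsilon, sensitivity, rng):
    """Stage 1: perturb the estimates. Stage 2: plug-in sampling."""
    A_hat, Sigma_hat = fit_multivariate_regression(X, Y)
    # Assumed budget split: half of epsilon for each statistic.
    A_dp = laplace_perturb(A_hat, sensitivity, epsilon / 2, rng)
    Sigma_dp = nearest_psd(
        laplace_perturb(Sigma_hat, sensitivity, epsilon / 2, rng)
    )
    # Plug-in sampling: draw synthetic responses from the fitted model,
    # using the perturbed estimates in place of the unknown parameters.
    errors = rng.multivariate_normal(
        np.zeros(Y.shape[1]), Sigma_dp, size=X.shape[0]
    )
    return X @ A_dp + errors


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(200, 2))
    Y_syn = double_protected_synthetic(X, Y, epsilon=1.0, sensitivity=0.1, rng=rng)
    print(Y_syn.shape)
```

Only the synthetic responses `Y_syn` would be released in this sketch; an analyst would then re-estimate the coefficient matrix and covariance from `(X, Y_syn)` and apply the inference methods developed in the paper.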