
Research On Privacy Calculation Of Cross-regional Gene Data

Posted on: 2021-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
GTID: 2518306050969099
Subject: Cyberspace security
Abstract/Summary:
With the rapid development of gene sequencing and computer technology, massive amounts of gene data have been produced and have become a valuable information resource in biomedicine. To make full use of the value contained in gene data distributed across different regions, an effective approach is for the owners of the gene banks to share their data and carry out joint computation and analysis. This helps cross-domain participants draw more accurate conclusions from a larger and more varied body of genetic data, yielding more valuable results and advancing genetic research as a whole. In practice, however, because genetic data is uniquely privacy-sensitive, such sharing and joint analysis have not been effectively implemented, and research on the problem remains very limited in academic circles at home and abroad. Most current work on genetic computing focuses on improving the accuracy and efficiency of the computation. The few works that consider data privacy and security mainly address genome sequence alignment, and very few involve multi-party cross-domain sharing and joint computation of gene data.

Therefore, to achieve privacy-preserving cross-regional shared computation and joint model training on genetic data, this thesis mainly adopts secure multi-party computation, homomorphic encryption and proxy re-encryption. It designs secure two-party cross-region schemes for jointly executing the Fisher exact test (FT) and linkage disequilibrium (LD) detection, and then proposes a basic scheme and an improved scheme for privacy-preserving multi-party cross-region joint model training on genetic data. The main research contents and achievements of this thesis are summarized as follows:

To solve the problem of privacy-preserving two-party cross-domain gene computation, this thesis uses the ABY development framework, which is based on a hybrid secure computation protocol, together with Boolean sharing and circuit optimization strategies, to design a secure FT algorithm and a secure LD detection algorithm that two cross-region parties execute jointly while protecting the privacy of their genetic data. Both secure algorithms are based on the original Fisher exact test and linkage disequilibrium detection algorithms. They realize the association judgment on gene data and the detection of non-random association between alleles, and they also protect the privacy of the gene data during joint computation. Experimental testing and performance analysis show that the two cross-domain two-party secure gene detection algorithms designed in this thesis are feasible.

To realize privacy-preserving cross-regional multi-user joint model training on genetic data, this thesis first uses secret-sharing-based secure multi-party computation to design a basic scheme in which multiple parties jointly train a logistic regression model. The scheme meets the need of multiple gene bank owners for joint model training and yields a logistic regression model that makes good binary classification predictions on gene data, while the original genetic sample data remains private and secure throughout training. However, because the basic scheme requires a large amount of secret sharing of model parameters among the users, it incurs a large communication overhead. For this reason, this thesis designs an improved scheme for cross-region multi-party joint training that combines homomorphic encryption and proxy re-encryption in a federated learning setting based on multiple distributed databases. Compared with the basic scheme, the improved scheme significantly reduces the communication overhead between users. Finally, security proofs and experimental analyses are carried out for both schemes, and the results show that the schemes proposed in this thesis are secure and efficient.
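For orientation, the two statistics that the secure two-party algorithms evaluate can be sketched in plaintext. The Python sketch below is not the thesis's ABY-based protocol and contains no cryptography. It only shows the textbook Fisher exact test on a 2x2 contingency table and the common D and r^2 measures of linkage disequilibrium on haplotype counts. The function names, the two-sided p-value convention and the choice of r^2 as the LD measure are assumptions made for illustration; the abstract does not state which variants the thesis uses.

```python
from math import comb


def hypergeom_pmf(a, row1, col1, n):
    """P(top-left cell == a) for a 2x2 table with fixed margins."""
    return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the table [[a, b], [c, d]].

    Assumed convention: sum the probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    p_obs = hypergeom_pmf(a, row1, col1, n)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    total = 0.0
    for x in range(lo, hi + 1):
        p = hypergeom_pmf(x, row1, col1, n)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            total += p
    return min(1.0, total)


def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r^2) for two biallelic loci from haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n          # frequency of allele B at locus 2
    D = n_AB / n - p_A * p_B         # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = D * D / denom if denom > 0 else 0.0
    return D, r2


if __name__ == "__main__":
    print(fisher_exact_two_sided(3, 1, 1, 3))    # about 0.486 (34/70)
    print(linkage_disequilibrium(50, 0, 0, 50))  # complete LD: D = 0.25, r^2 = 1
```

In a two-party setting, each gene bank would hold only part of the counts, and the secure algorithms would evaluate such formulas without revealing either party's inputs; the plaintext version is useful mainly as a correctness baseline.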
Keywords/Search Tags: Secure multi-party computing, Privacy protection, Federated learning, Homomorphic encryption, Proxy re-encryption