| With the rapid development of the Internet, data has become a key driver of innovation, and mastering data opens the door to the future of the Internet. Machine learning, an important branch of artificial intelligence, is maturing alongside it. Training machine learning models relies on massive amounts of data. Although large-scale data collection can improve the performance of machine learning applications, the data often contain a large amount of personal information. Once collected and shared, this information can no longer be effectively controlled, which seriously threatens the privacy and security of data owners. There is therefore an urgent need for privacy-preserving machine learning solutions. Logistic regression is widely used in many fields because of its low computational complexity and simple model structure, which makes it well suited to building classification models over encrypted data. This thesis proposes secure, privacy-preserving logistic regression schemes to address the privacy and security issues that arise when training logistic regression models. The main work is as follows. 1. A data-quality-aware multi-party logistic regression scheme based on homomorphic encryption is proposed to address two challenges: privacy protection in multi-party logistic regression training and poor quality of training data. A gradient similarity metric is proposed for the distributed setting to filter out parameters from data contributors with poor data quality. By measuring gradient similarity, the scheme identifies datasets that may reverse the direction of updates and excludes their parameters from the global weight update, thus maintaining the validity of the model. The entire dataset is split horizontally among the participants. Each participant trains a replica of the model on its own local dataset and uploads intermediate gradients to a central server, which updates the global model. Homomorphic encryption is used to protect privacy, and model performance is improved while data quality is maintained. 2. A gradient-optimised logistic regression scheme for homomorphic encryption is proposed in order to reduce computational and communication overheads. Gradient descent is evaluated homomorphically using the basic operations of homomorphic encryption for arithmetic of approximate numbers. The scheme includes the computation of a polynomial approximation of the sigmoid function, the iterative update of the weight vector, and the computation of ciphertext inner products, and it uses specific encoding methods to reduce the required storage space and optimise computation time. The Nesterov Accelerated Gradient (NAG) method is enhanced by optimising the gradient variables, which significantly improves convergence speed. Neither the trained model nor the users' data is leaked at any point in the process. |
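The gradient-similarity filtering described in the first contribution can be illustrated with a minimal plaintext sketch. The abstract does not specify the metric, the reference direction, or the threshold, so the choices here are assumptions made only for illustration: cosine similarity, a coordinate-wise median of the uploaded gradients as the reference, and a threshold of 0. The function names and the sample `uploads` are hypothetical, and the encryption layer is omitted entirely.

```python
import math
import statistics


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_and_aggregate(gradients, threshold=0.0):
    """Average the gradients that point in roughly the same direction as a reference.

    Assumption: the reference is the coordinate-wise median of all uploads,
    and a gradient is kept when its cosine similarity exceeds `threshold`.
    """
    reference = [statistics.median(column) for column in zip(*gradients)]
    kept = [
        index
        for index, grad in enumerate(gradients)
        if cosine_similarity(grad, reference) > threshold
    ]
    if not kept:
        # Nothing agrees with the reference; fall back to the reference itself.
        return reference, kept
    aggregate = [
        sum(gradients[index][j] for index in kept) / len(kept)
        for j in range(len(reference))
    ]
    return aggregate, kept


# Hypothetical uploads from three participants; the third reverses the direction.
uploads = [[1.0, 2.0], [1.2, 1.8], [-1.0, -2.0]]
aggregate, kept = filter_and_aggregate(uploads)
```

In this toy run the third upload has negative similarity to the reference and is excluded, so only the first two gradients contribute to the global update. A median reference is used because a plain mean can itself be dragged toward a reversed gradient; whether the thesis makes the same choice is not stated in the abstract.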
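The second contribution combines a polynomial sigmoid with Nesterov-style updates, and the interplay is easier to see in plaintext. The sketch below is not the thesis's scheme: it runs on unencrypted floats, uses the degree-3 Taylor expansion of the sigmoid as a stand-in for whatever approximation the thesis adopts, and applies textbook NAG without the gradient-variable optimisation mentioned above. All names, hyperparameters, and the toy dataset are hypothetical. Only additions and multiplications appear in the training loop, which is the property that makes such a loop expressible with approximate-arithmetic homomorphic encryption.

```python
def poly_sigmoid(x):
    """Degree-3 Taylor expansion of the sigmoid around 0 (assumed approximation).

    Accurate only for small |x|; it uses additions and multiplications only.
    """
    return 0.5 + x / 4.0 - x ** 3 / 48.0


def gradient(weights, features, labels):
    """Mean logistic-regression gradient with poly_sigmoid in place of the sigmoid."""
    n = len(features)
    grad = [0.0] * len(weights)
    for x, y in zip(features, labels):
        z = sum(w * xi for w, xi in zip(weights, x))  # inner product
        error = poly_sigmoid(z) - y
        for j, xi in enumerate(x):
            grad[j] += error * xi / n
    return grad


def train_nag(features, labels, lr=1.0, momentum=0.5, steps=200):
    """Textbook Nesterov accelerated gradient: evaluate the gradient at a look-ahead point."""
    dim = len(features[0])
    weights = [0.0] * dim
    velocity = [0.0] * dim
    for _ in range(steps):
        lookahead = [w + momentum * v for w, v in zip(weights, velocity)]
        grad = gradient(lookahead, features, labels)
        velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
        weights = [w + v for w, v in zip(weights, velocity)]
    return weights


# Hypothetical toy data: first column is a bias term, second is the feature.
# Labels are noisy on purpose: 3 of 4 samples at x = +1 are positive, 1 of 4 at x = -1.
features = [[1.0, 1.0]] * 4 + [[1.0, -1.0]] * 4
labels = [1, 1, 1, 0, 0, 0, 0, 1]
weights = train_nag(features, labels)
```

The labels are deliberately non-separable. With perfectly separable data the weights grow until the inputs leave the range where the cubic tracks the sigmoid, after which the updates can diverge, so a practical scheme needs to keep the inner products inside the interval for which its approximation was designed. On this toy dataset the feature weight settles where `poly_sigmoid` of it equals the empirical positive rate of 0.75, close to the value an exact sigmoid would give.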