Research On Privacy Preserved Loan Default Model

Posted on: 2021-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: P Z Dang
Full Text: PDF
GTID: 2518306569990619
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet finance and the credit card business, the number of borrower defaults and non-performing loans is increasing rapidly, which poses a serious challenge to financial institutions. To cope with this issue, some financial institutions share their blacklists of defaulting borrowers offline to reduce the non-performing loan ratio, but this approach has two major disadvantages. First, the blacklists cannot be exchanged in real time, so some defaulting borrowers may be missed. Second, it protects the private information of neither the borrowers nor the financial institutions. Therefore, a novel default model is needed to predict defaulting borrowers in a privacy-preserving manner. This thesis proposes a multi-party default model based on the FastMap clustering technique. Traditional privacy protection methods process the original data through encryption or other security-based algorithms. However, most existing privacy protection methods do not work well as the dimensionality of the original data grows.

The main content of this thesis can be summarized as follows. First, to protect high-dimensional and privacy-sensitive data, the FastMap projection is chosen as the dimensionality reduction method. Because FastMap uses projection to map the distances between data objects into a low-dimensional space, it protects the private data involved in the multi-party computation process. Second, we propose a novel clustering integration method, KMS, which combines the clustering results generated by all parties. The combined result thus represents the final clustering for all participating parties and is better than the clustering result from a single party. Third, a multicollinearity test is adopted to select the features used in the default model based on logistic regression, which ensures that the participating features play a significant role in the construction of the default model and thereby improves the interpretability of the model. Last, the semi-honest model of secure multi-party computation is used as the basis of the combined evaluation platform for loan default models based on FastMap and logistic regression, which protects client privacy information and allows us to build a highly reliable default evaluation model.

To verify the performance and effect of the proposed method, we evaluate FastMap on three data sets. The results show that FastMap can generate better sub data sets and handles noisy data well. We use three kinds of consensus functions to combine the base clusterings of four groups of real data sets from the Internet. We compare the clustering integration results with the true labels of the data and evaluate the effect of clustering integration with four evaluation methods. Clustering integration performs better than a simple clustering algorithm, and its performance decreases as the amount of noisy data increases. Noisy data makes it hard to generate correct clusters. By evaluating on a real data set, we find that the FastMap projection effectively protects data privacy and effectively transforms the data, and thus improves the integration effect of clustering results on high-dimensional data.
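The FastMap projection mentioned in the abstract can be illustrated with a minimal sketch of the standard single-party FastMap algorithm (Faloutsos and Lin). It is not the thesis's multi-party protocol and contains no privacy mechanism of its own; the function name `fastmap`, its parameters, and the simple pivot heuristic are assumptions made for illustration.

```python
import numpy as np

def fastmap(points, k):
    """Project points (n x d array) into k dimensions with standard FastMap.

    Illustrative sketch only: single party, Euclidean input distances,
    and a simple "farthest from the farthest" pivot heuristic.
    """
    X = np.asarray(points, dtype=float)
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of objects.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    coords = np.zeros((n, k))
    for col in range(k):
        # Pivot heuristic: b is farthest from object 0, a is farthest from b.
        b = int(np.argmax(d2[0]))
        a = int(np.argmax(d2[b]))
        dab2 = d2[a, b]
        if dab2 <= 1e-12:
            break  # no residual distance left; remaining coordinates stay 0
        # Cosine-law projection of every object onto the line through a and b.
        x = (d2[a] + dab2 - d2[b]) / (2.0 * np.sqrt(dab2))
        coords[:, col] = x
        # Residual squared distances in the hyperplane orthogonal to that line.
        d2 = np.maximum(d2 - (x[:, None] - x[None, :]) ** 2, 0.0)
    return coords

# Usage: three collinear points map onto a single coordinate axis.
pts = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
print(fastmap(pts, 1))
```

For Euclidean input, choosing `k` equal to the dimensionality of the data preserves pairwise distances; a smaller `k` only approximates them, and only these projected coordinates, not the original feature values, would be passed to later steps.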
Keywords/Search Tags: privacy protection, default model, ensemble clustering, logistic regression