
Research On Real-world Negative Survey And Its Reconstruction Algorithm For University Students

Posted on: 2019-04-19
Degree: Master
Type: Thesis
Country: China
Candidate: J G Wu
Full Text: PDF
GTID: 2428330596466394
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer networks and big data technologies, the disclosure of sensitive data and personal privacy has become increasingly serious. A negative survey is a survey method that protects personal privacy while collecting sensitive data: each respondent selects a category that does not apply to them (a negative category) instead of the one that does. Most previous work focuses on negative survey models built on specific hypotheses, e.g., that the probability of selecting a negative category follows a uniform or Gaussian distribution. However, when negative categories are selected manually in a real survey, that probability may not follow any such distribution. Moreover, according to the literature reviewed by the author, negative surveys have been used for data collection over networks but have never been conducted in the real world with negative categories selected manually by humans. This thesis conducted a real-world negative survey with manual selection, analyzed the data distribution and data reconstruction characteristics of such a survey, and proposed two reconstruction methods suited to manually selected negative surveys. The main work of this thesis is as follows:

(1) Conducted a real-world negative survey and its corresponding positive survey, both answered manually. This thesis analyzed sensitive questions about university students' study and life and, taking the features of negative surveys into account, designed a questionnaire with three parts: an anonymous positive survey, a real-name negative survey, and a real-name positive survey. The survey was carried out at Wuhan University of Technology and China University of Geosciences (Wuhan). After data cleaning, the numbers of valid responses in the three parts were 811, 550, and 528, respectively. Through preliminary statistics, this thesis analyzed the features of manually selected negative data and drew some conclusions about the distribution and reconstruction of negative data.

(2) Proposed NStoPS-M, a reconstruction method for negative surveys based on background knowledge. This thesis obtained a reconstruction matrix by jointly analyzing the sample data of the real-name negative survey and the real-name positive survey, and used this matrix as background knowledge to build NStoPS-M. The experimental results show that for most questions in the questionnaire (10 of 15), NStoPS-M achieves better results than previous reconstruction methods such as NStoPS and NStoPS-I. This thesis also analyzed how the results of NStoPS-M vary with the number of samples and the number of categories when reconstructing positive data.

(3) Proposed NStoPS-MLE, a reconstruction method for negative surveys based on maximum likelihood estimation. NStoPS-M can produce negative values when reconstructing positive data. This thesis analyzed the features of the selected negative categories and, combining the probability formula of the multinomial distribution with the constraints of the negative survey itself, derived the condition under which the likelihood of the positive data is maximized given the observed negative answers. NStoPS-MLE is built on this idea. The experimental results show that for most questions in the questionnaire (12 of 15), NStoPS-MLE achieves better results than NStoPS, NStoPS-I, and NStoPS-M, and that it avoids the negative-value problem of NStoPS-M. This thesis also analyzed how the results of NStoPS-MLE vary with the number of samples and the number of categories when reconstructing positive data.

In summary, this thesis conducted a real-world negative survey with manual selection and, after analyzing the collected data, proposed two reconstruction methods based on background knowledge and maximum likelihood estimation, respectively. Their accuracy was verified by experiments, and some regularities in the reconstruction results were identified. This work can provide useful guidance for research on the theory and application of negative surveys.
Keywords/Search Tags:Privacy Protection, Negative Survey, Background Knowledge, Reconstruction Method, Maximum Likelihood Estimation
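
For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of the classic uniform-selection reconstruction usually referred to as NStoPS. It is not the thesis's NStoPS-M or NStoPS-MLE, whose details are not given in the abstract. It assumes each respondent picks one of the c - 1 categories that do not apply to them uniformly at random, so the expected negative count for category i is (N - t_i) / (c - 1), which gives the estimate t_i = N - (c - 1) * r_i. The function name `nstops` and the sample counts are illustrative assumptions, not data from the thesis.

```python
def nstops(negative_counts):
    """Estimate positive-survey counts from negative-survey counts.

    Assumes uniform selection among the c - 1 non-matching categories
    (the classic baseline; an assumption, not the thesis's new methods).
    """
    c = len(negative_counts)
    if c < 2:
        raise ValueError("a negative survey needs at least two categories")
    n = sum(negative_counts)  # total number of respondents N
    # E[r_i] = (N - t_i) / (c - 1)  =>  t_i = N - (c - 1) * r_i
    return [n - (c - 1) * r for r in negative_counts]


# Made-up negative counts for a four-category question (hypothetical data).
negative_counts = [30, 25, 35, 10]
estimate = nstops(negative_counts)
print(estimate)  # [10, 25, -5, 70]
```

The estimates always sum to N, but an individual entry can be negative (here the third category), which is the kind of negative-value problem the abstract says NStoPS-MLE is designed to avoid.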