
Privacy-aware Multi-type Health Data Missing Value Prediction And Application

Posted on: 2022-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Z Kong
Full Text: PDF
GTID: 2518306323486864
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of the national economy, living standards and quality of life have risen quickly for most people. As people pursue a higher quality of life, they pay more attention to their own health. In daily life, they often monitor their health through health data, such as hospital medical examination records, data from fitness bracelets, and data from sports and health software. In the age of big data there are many ways to obtain health data, so personal health data accumulates steadily and its significance is increasingly apparent. Through health data, people can understand their health status more intuitively, identify what their bodies need, and adjust their habits accordingly.

In practice, however, data can be lost for many reasons, such as system updates, equipment failure, and operator error. Data loss is a common problem in many fields, including health care. Missing data leaves a person's records incomplete, which makes their historical data unusable, so the missing health data must be predicted and imputed. Two major problems arise in predicting and imputing missing health data: (1) Predicting missing data from historical user data inevitably exposes users' privacy to leakage, so protecting users' private information is a vital requirement and a challenging practical problem. (2) Users' health data is complex. It generally contains multiple dimensions, and each dimension may hold a different data type (such as continuous, discrete, or Boolean data), which makes the prediction and imputation of missing health data very challenging. Given this situation, we aim to predict and impute missing health data accurately while preserving users' privacy, and then to apply the approach in practice.

The specific research work of this thesis is as follows:

(1) Locality-Sensitive Hashing (LSH) is introduced to address the missing data problem. LSH is an effective technique for approximate nearest neighbor search. With LSH, we can build indices for users offline and then find the similar neighbors of a target user based on his/her index. Moreover, by projecting user data that contains sensitive information into indices that contain nothing sensitive, we achieve the goal of privacy preservation. To verify the validity of our approach in handling multi-type data and preserving users' privacy, a wide range of experiments is conducted on the WISDM dataset.

(2) The above approach is applied to the healthcare field, where we develop a system to predict and impute missing health data. We first analyze the feasibility, functional requirements, and non-functional requirements of the system. We then design its overall architecture. Finally, we implement the functional modules of the system in Java using the Eclipse development environment.
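The indexing and neighbor-search idea described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses random-hyperplane (sign) hashing, one common LSH family, because the abstract does not say which hash family, how many hash tables, or which rule for combining neighbors' values the thesis uses. All function names, parameters, the mean/majority-vote imputation rule, and the toy data are hypothetical, and the sketch is written in Python rather than the Java used for the thesis system.

```python
import random
from collections import Counter, defaultdict


def make_hyperplanes(dim, num_bits, rng):
    """Draw num_bits random Gaussian hyperplanes (one hash table's functions)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """Project a numeric vector to a bit tuple, one sign bit per hyperplane.

    Missing entries (None) are skipped, i.e. treated as 0 in the projection.
    The signature, not the raw vector, is what would be shared as the index.
    """
    bits = []
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, vector) if x is not None)
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def build_index(users, tables):
    """Offline step: bucket every user id by its signature in each table."""
    index = [defaultdict(list) for _ in tables]
    for user_id, vector in users.items():
        for buckets, planes in zip(index, tables):
            buckets[lsh_signature(vector, planes)].append(user_id)
    return index


def find_neighbors(target_vector, index, tables, exclude=None):
    """Online step: users that share a bucket with the target in any table."""
    neighbors = set()
    for buckets, planes in zip(index, tables):
        neighbors.update(buckets.get(lsh_signature(target_vector, planes), []))
    neighbors.discard(exclude)
    return neighbors


def impute(users, neighbor_ids, dim_index, kind):
    """Assumed rule: mean for continuous data, majority vote otherwise."""
    values = [users[n][dim_index] for n in neighbor_ids
              if users[n][dim_index] is not None]
    if not values:
        return None  # no neighbor has a value for this dimension
    if kind == "continuous":
        return sum(values) / len(values)
    return Counter(values).most_common(1)[0][0]


# Toy data (invented): [resting heart rate, daily steps in thousands, smoker flag]
rng = random.Random(42)  # fixed seed so the sketch is repeatable
tables = [make_hyperplanes(dim=3, num_bits=4, rng=rng) for _ in range(5)]
users = {
    "u1": [70.0, 8.0, 0],
    "u2": [72.0, 7.5, 0],
    "u3": [95.0, 2.0, 1],
}
index = build_index(users, tables)

target = [71.0, None, 0]  # the step count is missing
neighbors = find_neighbors(target, index, tables)
print(neighbors, impute(users, neighbors, 1, "continuous"))
```

Using several hash tables raises the chance that truly similar users collide in at least one bucket, at the cost of more candidates to examine. In practice the features would also need to be scaled to comparable ranges before hashing, which this sketch omits.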
Keywords/Search Tags: Multi-Type health data, Missing data prediction, Privacy-Preservation, Locality-Sensitive Hashing