Mining And Privacy Protection On Activity Trajectory Data

Posted on: 2020-01-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X M Li
GTID: 1368330578481651
Subject: Computer software and theory

Abstract/Summary:
With the rapid development of sensors and the mobile Internet, people's daily lives are sensed all the time. GPS is widely used in smartphones, making it easy for users to obtain their current location and map navigation. In recent years, the rise of location-based social networks has also allowed users to share their current location and events more easily, which improves human mobility. As people enjoy these location-based services, a large amount of trajectory data is recorded. These data contain not only time information and users' locations but usually also users' current activities; we call such data activity trajectories. On the one hand, from the perspective of data mining, analyzing these data helps to learn users' behavior patterns and thus to provide personalized services such as life suggestions and location recommendation, improving quality of life. On the other hand, from the perspective of privacy protection, these data may contain users' sensitive information. Inappropriate publishing and dissemination may lead to privacy leakage and thus have negative individual and social impact. The basis of these two issues, or the bridge linking them, is the understanding and modeling of human behavior. Our research focuses on these three aspects, as follows.

User behavior modeling on activity trajectories. Understanding and modeling human mobility data plays an important role in many fields. Compared with traditional GPS data, activity trajectories usually contain rich activity information and user profiles in addition to time and location information, which helps to understand user behavior better. First, we introduce two simple behavior models based on Markov chains. Then we propose a new behavior model based on a topic model. Making full use of activity trajectories, the proposed model considers not only the hidden states behind user behaviors and the transitions between states, but also the similarity between different users and the influence of user profiles. Next, we discuss the implementation details and evaluation of the proposed model. Finally, we carry out a comparative analysis of the two types of behavior model. This research is the foundation of the subsequent studies on location prediction and privacy protection.

Location prediction on activity trajectories. Location prediction is a typical task in the field of data mining. Existing location predictors usually do not take time and the rich activity information into consideration. We propose a location predictor enhanced by time-stamped activity inference. It consists of two steps. The first step infers the next time and activity information based on the behavior model proposed earlier. The second step integrates the various factors that affect the user's next location through a probabilistic mixture model, which uses the time and activity information inferred in the first step. Finally, the prediction result is obtained by marginalizing the joint distribution over location, time and activity. Experimental results show that our proposed algorithm outperforms the baselines on both time-stamped activity inference and location prediction, which demonstrates that time and activity information do improve the performance of location prediction. By analyzing the factors that affect the accuracy of location prediction, we find that trajectory length and the number of visited locations have little influence on accuracy, but the regularity of human behavior does to some extent. In addition, for most users, the current location and the next time stamp make the greatest contribution to predicting the next location.

Privacy-preserving activity trajectory publishing. Data privacy is currently a hot research topic. We study privacy-preserving activity trajectory publishing algorithms, aiming to protect user privacy while ensuring the quality of the published data set. First, we define the privacy protection problem from four aspects: user model, adversary model, privacy requirement and data quality. Then we propose two publishing algorithms based on the previously proposed behavior models. We discuss the implementation details of the algorithms and prove that both satisfy the privacy requirement and that both optimize data quality in some sense. Finally, we test the performance of the two algorithms by comparing them with several baseline algorithms on a campus smart card data set, in terms of privacy breach rate, data quality and running time. The results show that our proposed algorithms are able to preserve users' privacy. In addition, we put forward a method to further improve the data quality and reduce the running time.
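
As an illustration only, and not the dissertation's implementation, two ingredients named in the abstract can be sketched in Python: a simple first-order Markov chain behavior model estimated by counting transitions, and location prediction by marginalizing a joint distribution over the inferred next time and activity. The function names, place names and probabilities below are hypothetical, and the factorization P(l, t, a) = P(t, a) P(l | t, a) is a simplifying assumption; the abstract does not specify the actual mixture model.

```python
from collections import Counter, defaultdict


def fit_markov(trajectories):
    """Estimate first-order transition probabilities P(next | current) by counting."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        # Count each consecutive (current, next) pair in the trajectory.
        for cur, nxt in zip(traj, traj[1:]):
            counts[cur][nxt] += 1
    # Normalize the counts of each row into probabilities.
    return {
        cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for cur, nxts in counts.items()
    }


def predict_location(p_time_activity, p_loc_given):
    """Return P(l) = sum over (t, a) of P(t, a) * P(l | t, a).

    p_time_activity: {(time, activity): probability}, the inferred next
        time and activity (assumed to be given by a behavior model).
    p_loc_given: {(time, activity): {location: probability}}.
    """
    scores = defaultdict(float)
    for (t, a), p_ta in p_time_activity.items():
        for loc, p_l in p_loc_given.get((t, a), {}).items():
            scores[loc] += p_ta * p_l
    return dict(scores)


# Hypothetical toy data, loosely inspired by a campus setting.
toy_trajectories = [
    ["dorm", "canteen", "library"],
    ["dorm", "canteen", "classroom"],
    ["dorm", "library"],
]
transitions = fit_markov(toy_trajectories)

inferred = {("noon", "eat"): 0.7, ("noon", "study"): 0.3}
conditional = {
    ("noon", "eat"): {"canteen": 0.9, "library": 0.1},
    ("noon", "study"): {"library": 0.8, "classroom": 0.2},
}
scores = predict_location(inferred, conditional)
best = max(scores, key=scores.get)  # most probable next location
print(transitions["dorm"], best)
```

In this toy setting the marginal scores sum to one because each conditional distribution does; a real predictor would also need smoothing for unseen transitions, which the sketch omits.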
Keywords/Search Tags: activity trajectory, data mining, privacy, privacy protection, behavior modeling, topic model, location prediction