Spatial-Temporal Data Mining Based On GPS Trajectory And Geo-Tagged Photo Trajectory

Posted on: 2014-07-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G N Wang
Full Text: PDF
GTID: 1268330401479046
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Recent advances in information and location-aware technologies have enhanced our ability to collect trajectory data for individual people, vehicles, and other moving objects. The analysis of trajectory data, which lets us discover valuable information and infer new knowledge, has become an active research topic at the intersection of computer science and geographic information science. Furthermore, a family of geographic applications based on user-generated trajectory data has appeared on the Web and received considerable attention. As a result, spatial-temporal data mining has become a prominent research issue. Building on recent research, this PhD thesis studies data mining methods based on GPS trajectories and geo-tagged photo trajectories. Such methods can support traffic, travel, and personalized service recommendation applications, among others. Experiments on real trajectory data show good performance.
This thesis focuses on the following four key points:
1) First, we propose a series of Geometric Similarity Algorithms (GSAs) to analyze real GPS trajectories geographically. Trajectory similarity matters for road network, traffic, and geographic systems because it enables effective retrieval of highly relevant information. In our approach, we first propose a Length-Angle Ratio to detect the significant regions in a trajectory, and then measure trajectory similarity from the differences between the geometric features of two trajectories. Additionally, by fully analyzing these geometric features, we take into account both the individual characteristics of each traveler and the uniqueness of each trajectory. We evaluate the proposed method on collected real-world geographic location data. The results show good performance; furthermore, the proposed method outperforms the existing method in both accuracy and computing efficiency.
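The abstract does not define the Length-Angle Ratio or the GSAs, so they are not reproduced here. As a minimal sketch of the raw geometric features that such a similarity measure could be built on, the following computes segment lengths and turning angles for a GPS track. The function names (`haversine_m`, `bearing_deg`, `segment_features`), the spherical Earth model, and the choice of these two features are assumptions made for illustration only, not the thesis's method.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (spherical approximation)


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(p, q):
    """Initial bearing from p to q in degrees, in the range [0, 360)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def segment_features(track):
    """Return (segment lengths in metres, turning angles in degrees).

    `track` is a list of (lat, lon) points. The turning angle at each
    interior point is the absolute change of bearing, folded into [0, 180].
    """
    pairs = list(zip(track, track[1:]))
    lengths = [haversine_m(a, b) for a, b in pairs]
    bearings = [bearing_deg(a, b) for a, b in pairs]
    turns = []
    for b1, b2 in zip(bearings, bearings[1:]):
        d = abs(b2 - b1) % 360.0
        turns.append(min(d, 360.0 - d))
    return lengths, turns
```

For a toy track that goes one degree east along the equator and then one degree north, `segment_features([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)])` yields two segments of roughly 111 km each and a single 90 degree turn.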
2) Second, we propose a novel travel route restoration method to analyze geo-tagged photo trajectories. Sharing geo-tagged photos has become a popular social activity in daily life, because these photos not only contain geographic information but also indicate people’s hobbies, intentions, and mobility patterns. However, raw geo-tagged photo routes cannot provide as much information as complete GPS trajectories because of the defects hidden in them. In our approach, we first propose an Interest Measure ratio to rank the hot spots found by a density-based spatial clustering algorithm. Then we apply the hidden semi-Markov model and a Mean Value method to describe the migration patterns among the hot spots and to restore the significant region sequence into a complete GPS trajectory. Finally, a novel experimental method is designed to demonstrate that the approach is feasible for route restoration and performs well.
3) Third, we study the travel patterns hidden in geo-tagged photo trajectories. Understanding the basic laws of travel activity is important for applications in travel planning, forecasting, and recommendation. Although there have been many similar studies, our understanding remains limited owing to the lack of tools for monitoring the time-resolved location of individuals. Here we study geo-tagged photo trajectories crawled from the Flickr website. We find that many parameters of travel walks follow a power law and further exhibit a heavy tail (log-normal distribution), such as the Length-Angle Ratio, which helps us find the significant regions (sig-regions), the stay time of travelers in sig-regions, and the distance between sig-regions. Besides these common statistical features, the travel trajectories also show a high degree of temporal and spatial regularity.
To study this regularity further, we explain why travel flights follow a log-normal distribution, and we investigate how travelers decide on the next destination in order to probe the travel patterns more deeply. Additionally, our work points out that regular human walks and travel walks differ because of the large differences in their underlying properties. These common statistical features and properties are important for the study of human travel activity and can also support further recommendation and forecasting applications.
4) Finally, we analyze the censored data in trajectories with survival analysis methods. For various reasons, time intervals can often be observed and measured only partially; this is the censoring problem. Ordinary regression models cannot handle censored data, so models designed for censored data are needed. The time interval T between two photos is right censored, and we take T as an example for survival analysis. We first establish a nonparametric model of T with the Kaplan-Meier estimator and study the corresponding survival function and hazard function. Then we establish a Cox model and a Buckley-James (B-J) model to describe the relationship between T and the distance between photos, and estimate the confidence interval with the empirical likelihood method. Lastly, we establish a semi-parametric model of T and the basic information of users. Existing studies mainly use the empirical likelihood (EL) method based on synthetic dependent data, and the result cannot be applied directly because of the weights involved. In this thesis, a censored empirical log-likelihood ratio is introduced to tackle this problem. In particular, we demonstrate that its limiting distribution is a standard chi-squared distribution. This method is used to calculate the p-value and construct the confidence interval.
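The Kaplan-Meier step mentioned above can be illustrated with a short, dependency-free sketch. It implements the standard product-limit estimate of the survival function of a right-censored time such as T; the function name `kaplan_meier` and its input layout are assumptions chosen for this example, and the Cox, Buckley-James, and empirical likelihood parts of the thesis are not covered.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier (product-limit) estimate of the survival function S(t).

    durations: observed times, each either an event time or a censoring time.
    observed:  flags of the same length; True if the event was observed,
               False if the time is right censored.
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have the same length")

    # The estimate only changes at times where at least one event was observed.
    event_times = sorted({t for t, e in zip(durations, observed) if e})

    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events (not censorings) that occur exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve
```

With the toy input `kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])` (made-up values, not data from the thesis), the estimate steps down to about 0.8, 0.6, and 0.3 at t = 1, 2, and 3. The censored observations never trigger a step themselves, but they still count in the at-risk set until their censoring time.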
Simulation studies are conducted to assess the performance of the proposed EL method, and the results show that it performs well.
In this thesis, we provide technical support for the rapid development of information technology and spatial-temporal data by proposing convenient and precise mining algorithms. The proposed algorithms are not confined to the field of computing; their construction and techniques can be used in other research and applications. The work in this thesis enriches the theory and methods of spatial-temporal data mining and has broad applicability and practical significance.
Keywords/Search Tags: Spatial-temporal data mining, GPS trajectory, Geo-tagged photo trajectory, trajectory similarity, power-law, hidden semi-Markov model, survival analysis