Transport Service Oriented Multi-source Mobile Trajectory Data Mining And Multi-level Knowledge Discovery Of Human Activities

Posted on: 2013-01-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z W Deng
GTID: 1110330374967758
Subject: Cartography and Geographic Information System
Abstract/Summary:
In the age of the smart world and Volunteered Geographic Information (VGI), obtaining and handling the enormous amounts of human activity data generated from diverse sources has become a fundamental challenge for scientists and management practitioners in many fields. One such issue concerns the mass trajectory data generated daily by inner-city personal travel, which can presumably be used for city traffic improvement and transport management. This thesis addresses the research question of how trajectory data, combined with heterogeneous data from other sources, can be used effectively to detect the social activity patterns of urban residents and to model their spatiotemporal behavior in the city environment. To this end, several technical objectives were accomplished, including the development of a trajectory data mining framework, the Scene-Domain Knowledge Driven Framework (SDKDF), to represent and analyze human activities at both individual and aggregate scales. In addition, data acquisition and processing methods for extracting activity information were discussed. The major contributions of this thesis are the following four points.
(1) The acquisition, fusion, organization, and quality assessment of heterogeneous human activity data from a variety of sources. The thesis grouped the available human activity data into three categories: household travel survey (HTS) data, VGI data, and trajectory data automatically generated by intelligent transportation systems. Two innovative explorations of HTS methods were conducted: the design and implementation of a procedure coupling a passive GPS survey with a web-based questionnaire survey, and a web-based data mining procedure with geospatial recalls for self-reports from respondents. These approaches offer the potential to construct an activity database with multiple spatial scales to support hierarchical analyses.
Specific work included the following. (a) In the approach coupling GPS and web-based survey technologies, selected GPS data loggers were used to collect respondents' track data for accurate route and trip detection, while web-based surveys were used to collect personal information from the respondents. Four case studies were conducted to test and improve the approach. The results indicated that passive GPS and web-based survey techniques could significantly improve the accuracy and efficiency of travel data collection and drastically reduce respondents' burden and survey costs. The approach appears promising as a major means of data collection in the next generation of HTS. (b) VGI and its websites are new and rapidly growing data sources that provide great opportunities for human activity studies at the community scale. The thesis designed a standard workflow for data cleaning, fusion, and knowledge mining from several VGI websites. (c) Massive trajectory data collected from taxi companies were processed to provide wide coverage and a dynamic representation of human activities in Shanghai. This type of data proved suitable for macro-level analysis of human behavior. The thesis synthesized the data of the three categories and organized them for analyses at different scales.
(2) Proposal of the Scene-Domain Knowledge Driven Framework (SDKDF) for extracting and analyzing activity information. Traditional data mining relies only on the spatiotemporal characteristics of the trajectory data and completely ignores the behavioral patterns of, and domain knowledge about, the respondent.
The SDKDF presented a top-down approach that reconstructs the activity scene from heterogeneous geographical background data and formulates reasoning rules with respect to socioeconomic and personal constraints, based on the theories of time geography and behavioral science. To support the SDKDF, an object-oriented activity representation data model was developed, which treated a trajectory as an ordered sequence of "stops" (representing the locations of individual activities) and "moves" (representing directional movements between adjacent activity stops). Semantic information such as trips, travel modes, and trip purposes was semi-automatically extracted from the trajectory data with machine learning techniques (i.e., C5.0).
(3) SDKDF-based representation and analysis of human activities. Following the object-oriented trajectory representation model proposed above, the move-stop sequence was semantically labeled to form a representation model of residential travel activity under the SDKDF. The semantic labeling was performed by placing the move-stop sequence within its geographic background (named a scene window) and with reference to domain knowledge. Travel speed, duration, distance, and other travel characteristics were derived for both stops and moves from their respective spatial and temporal information. The scene window of crucial points (such as origin-destination (OD) points) was used to express the urban environment (opportunities), and the socioeconomic characteristics of individual respondents were associated with unique PIDs. Data represented in the model could be readily queried and analyzed with multiple, complex conditions related to location, time, personal attributes, and activities. To analyze changes in activity patterns, a geo-visualization method known as the Minimum Convex Polygon (MCP) was introduced from the field of biology.
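As an illustration of the MCP idea mentioned above, the following is a minimal sketch that treats the MCP of a respondent's activity stops as their convex hull and reports its area. It is not the thesis's implementation: the function names, the use of planar (projected) coordinates, and the sample stop locations are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed: projected x/y coordinates, e.g. in metres


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def minimum_convex_polygon(points: List[Point]) -> List[Point]:
    """Convex hull (Andrew's monotone chain), vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical stop locations for one respondent on one day.
stops = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.0)]
mcp = minimum_convex_polygon(stops)
mcp_area = polygon_area(mcp)
```

Comparing the MCP area or shape across days or respondents is one simple way to express how the spatial extent of an activity pattern changes.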
The spatiotemporal variation of activity patterns was analyzed using the MCP method to gain a better understanding of the complex internal decision-making process under given personal or family circumstances.
(4) Detection and analysis of the spatiotemporal structure of large-scale urban activities using mass GPS-based taxi service datasets. Although taxi movement is stochastic at the individual level, the overall pattern of taxi services over the course of a day should, in theory, conform to the changing pattern of demand in relation to urban land use settings and activity schedules. To verify this, a GPS-based taxi service dataset with a total of 9,349 OD records was explored to map the one-day activities of Shanghai residents in 12 different time slots. By assuming that a taxi's destination reflects the passenger's trip purpose, the SDKDF method was applied again to obtain the overall spatiotemporal structure of residents' activities. Although further analyses are still needed to reveal the activity dynamics in more depth, the preliminary results demonstrated the feasibility of this approach to large-scale urban activity analysis and the potential of integrating it into the analytical hierarchy of human activities at various spatial scales.
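The time-slot mapping described in point (4) can be sketched as a simple aggregation of taxi drop-offs by time slot and destination zone. This is only an illustration under stated assumptions: the abstract does not say how the 12 slots were defined, so equal two-hour slots are assumed here, and the record layout, zone identifiers, and function names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Assumed record layout: (drop-off time, destination zone id).
ODRecord = Tuple[datetime, str]


def time_slot(ts: datetime, n_slots: int = 12) -> int:
    """Index of the equal-width daily time slot containing ts (assumption: 24/n_slots hours each)."""
    if 24 % n_slots != 0:
        raise ValueError("n_slots must divide 24 evenly")
    return ts.hour // (24 // n_slots)


def dropoffs_by_slot_and_zone(records: Iterable[ODRecord], n_slots: int = 12) -> Dict[int, Counter]:
    """Count drop-offs per destination zone within each time slot."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for dropoff_time, zone in records:
        counts[time_slot(dropoff_time, n_slots)][zone] += 1
    return counts


# Hypothetical sample records; zone ids "A" and "B" are placeholders.
sample = [
    (datetime(2012, 5, 1, 8, 15), "A"),
    (datetime(2012, 5, 1, 9, 59), "A"),
    (datetime(2012, 5, 1, 10, 0), "B"),
    (datetime(2012, 5, 1, 23, 30), "A"),
]
structure = dropoffs_by_slot_and_zone(sample)
```

Each slot's zone counts can then be mapped to show where demand concentrates at different times of day.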
Keywords/Search Tags: Trajectory data, Spatio-temporal data mining, Human activity analysis, GPS, Scene-Domain Knowledge-Driven Framework