
Retrieval Method Of Events In Web Pages Based On Spatio-temporal Elements

Posted on: 2014-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: C L Du
Full Text: PDF
GTID: 2268330401969767
Subject: Cartography and Geographic Information System
Abstract/Summary:
This thesis is supported by the national "863" project "The Research of Associated Updating of the Ubiquitous Spatial Information and Mining of Subject-oriented Spatio-temporal Information" and explores an approach to acquiring and retrieving web pages about specific events. The results provide data support for the structured expression, temporal-sequence reconstruction, visualization, and mining analysis of event information drawn from multi-source network information. Taking natural disaster events as the case, the research follows the main idea of "acquisition, storage and organization, retrieval service" for event web pages. The main contents and methods are as follows:

(1) Acquisition of event web pages based on spatio-temporal elements. By analyzing the contents and features of event page descriptions, a basic event template is constructed that contains time, spatial location, and event elements. Based on this template, a web crawler is designed to fetch the web pages closely related to specific events. Experiments show that, compared with a traditional crawler, the proposed event crawler filters web pages effectively and retrieves relevant pages with high accuracy. However, the large number of calculations in the crawler reduces its performance.

(2) Distributed indexing and storage of event web pages by "Time-Space-Theme". Time and spatial information in the event web pages is extracted using rules and a conditional random field model. Event web pages are classified with a Support Vector Machine model. To overcome the low efficiency of event retrieval, a distributed index based on "Time-Space-Theme" is designed. The massive set of web pages is stored in a distributed manner with the HBase database and HDFS (Hadoop Distributed File System).

(3) "Text-Map" interactive retrieval services for event web pages. Retrieval sentences for event information are interpreted automatically, based on a summary of the descriptive features of retrieval queries. By introducing the vocabulary organization structure of synonym substitution for Chinese text, a lexical knowledge dictionary and a similarity retrieval model for natural disaster events are constructed. The similarity between candidate web pages and the retrieval conditions is calculated and used to rank the candidates.

(4) Development of the prototype system. Based on the above studies, a prototype system is designed with the system configuration environment and the Google Map API. The system includes functions for acquisition, distributed indexing, and storage of web pages, and it provides retrieval services for event web pages that integrate text and map.
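The abstract does not give the layout of the "Time-Space-Theme" index, so the following is only a minimal sketch of one way such an index could work, not the thesis's actual design. It assumes a composite key of the form time|region|theme|page, relies on the fact that HBase keeps rows sorted lexicographically by row key, and imitates a prefix scan with a sorted Python list. The key format, the region codes, the function names make_row_key and prefix_scan, and all sample pages are hypothetical and made up for illustration.

```python
from bisect import bisect_left
from datetime import date


def make_row_key(day: date, region: str, theme: str, page_id: str) -> str:
    """Compose a hypothetical 'time|space|theme|page' key (assumed format)."""
    return f"{day:%Y%m%d}|{region}|{theme}|{page_id}"


def prefix_scan(sorted_keys: list[str], prefix: str) -> list[str]:
    """Return every key that starts with `prefix` from a sorted key list.

    This mimics a prefix scan over lexicographically sorted row keys.
    """
    start = bisect_left(sorted_keys, prefix)
    hits = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        hits.append(key)
    return hits


# Invented sample pages: (date, region code, theme, page id).
pages = [
    (date(2013, 4, 20), "CN-51", "earthquake", "p001"),
    (date(2013, 4, 21), "CN-51", "earthquake", "p002"),
    (date(2013, 6, 2), "CN-43", "flood", "p003"),
    (date(2013, 7, 22), "CN-62", "earthquake", "p004"),
]
index = sorted(make_row_key(*p) for p in pages)

# All pages whose event date falls in April 2013.
hits = prefix_scan(index, "201304")
print(hits)
```

Putting time first makes date-range queries cheap but forces a full scan for theme-only queries; a real design would choose the field order, or maintain several indexes, according to the dominant query pattern.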
Keywords/Search Tags: web page, event, spatio-temporal elements, retrieve, index of "Time-Space-Theme"