Geographic Information Changes Recognition Based On Massive Text Information Mining

Posted on: 2014-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zhou
Full Text: PDF
GTID: 2268330401965492
Subject: Cartography and Geographic Information Engineering
Abstract/Summary:
The rapid development and expansion of Internet technology has brought about an explosion of information online. The emergence of search engines has contributed greatly to retrieving requested information more precisely and quickly. About one fifth of the queries submitted to search engines relate to geographic information, so obtaining useful geographic information from search engines is an important direction in Geographic Information System (GIS) research. Most geographic information on the Internet, however, is embedded in massive amounts of unstructured information. The results that search engines return for a user's request are numerous and lack accuracy, which makes it difficult for users to extract their target information efficiently from complex Internet content. This study therefore focuses on two questions: how to extract geographic information from web pages that contain relevant content, and how to extract changed geographic information from vast amounts of information.

This paper proposes solutions for extracting changed geographic information from the Internet, including the general design of a Geographic Information Discovery System based on text mining and a discovery method based on theme information extraction and filtering. It also studies an Internet-based solution and a system framework capable of updating and iterating geographic data in a timely manner. A Geographic Information Discovery System was developed and implemented.
The main research contributions are as follows:

(1) For data analysis, the system compares the large volume of text descriptions retrieved from the Internet, together with the extracted change information, against the toponym (place-name) database of Sichuan Province and administrative boundary data, and thereby obtains the approximate location of changes to surface features. The system then stores this information as the location parameter of the changed feature so that it can be used with a path analysis tool. This process spares update personnel the complication and inconvenience of finding their way to the location. For data display, a system of symbols presented as thematic information has been designed, based on various parameters abstracted from the spatial information of the changed features. Platform maintainers can thus visually analyze all types of information about the distribution of changes in ground features and develop sound strategies and methods for updating the change information.

(2) The main modules are: database management, information retrieval, theme extraction, assistant analysis, and result display. Most importantly, the Theme Extraction Module matches keywords by considering not only sentence patterns but also semantics, which improves the accuracy of information retrieval.

(3) The system was implemented and the displayed results were analyzed. The experiment shows that effectiveness improved dramatically: the number of results fell from the original 10,000 to 3,000. The experimental results show an average accuracy of 55%, and in some cases the figure reaches 90%.
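
The place-name matching and theme filtering described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the gazetteer entries, their coordinates, the list of change-related keywords, and all function names are hypothetical, and the sketch uses plain keyword matching only, without the sentence-pattern and semantic analysis that the Theme Extraction Module is said to perform.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteer: place name -> (longitude, latitude).
# Entries and coordinates are illustrative, not taken from the thesis.
GAZETTEER = {
    "Chengdu": (104.07, 30.67),
    "Mianyang": (104.68, 31.47),
}

# Hypothetical keywords that suggest a geographic feature has changed.
CHANGE_TERMS = ("opened", "renamed", "demolished", "rerouted", "completed")


@dataclass
class ChangeCandidate:
    place: str       # matched place name
    location: tuple  # (longitude, latitude) from the gazetteer
    term: str        # change keyword that triggered the match
    sentence: str    # source sentence, kept for later review


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_change_candidates(text):
    """Keep sentences that mention both a change keyword and a known place."""
    candidates = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        # First change keyword found in the sentence, or None.
        term = next(
            (t for t in CHANGE_TERMS if re.search(rf"\b{t}\b", lowered)), None
        )
        if term is None:
            continue  # filter out sentences with no change keyword
        for place, location in GAZETTEER.items():
            if re.search(rf"\b{re.escape(place)}\b", sentence):
                candidates.append(
                    ChangeCandidate(place, location, term, sentence)
                )
    return candidates
```

Calling `find_change_candidates` on retrieved text returns only the sentences that pair a known place name with a change keyword, each tagged with a location that a later step (for example, path analysis) could use. Discarding the remaining sentences is the kind of filtering that shrinks a large result set.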
Keywords/Search Tags: changes of geographic information, text mining, theme extraction and filtering, information retrieval