
Multi-source POI Information Fusion Based on Natural Language Processing

Posted on: 2014-08-20  Degree: Master  Type: Thesis
Country: China  Candidate: R S Li  Full Text: PDF
GTID: 2268330401484707  Subject: Computer software and theory
Abstract/Summary:
In recent years, location-based services such as online maps, mobile location-based services (LBS), and portable navigation devices (PND) have developed rapidly, and the original point of interest (POI) data can no longer adequately support them. Access to high-quality POI information is extremely important to location-based services. With the rapid growth in consumption, more attention is being paid to dining, entertainment, tourism, and other areas of daily consumption. This has given rise to many information providers in these areas, and the information they provide is rich and timely.

In light of the above, how to obtain the large amount of valuable POI information contained on the web has become a hot issue. Correcting and integrating existing POI data to obtain structured, valuable data is of both theoretical and practical significance. This thesis systematically studies multi-source POI information fusion, including representing the various features of a POI, classifying POIs for fusion, unifying the coordinate field, and working around limited network access. The specific research and results are as follows:

(1) By analyzing the forms and features of the various POI fields, the thesis proposes a POI characteristic similarity, which expresses the relation between a POI to be fused and the original POI collection and is used to make the fusion judgment. POI characteristic similarity consists mainly of name similarity, address similarity, and coordinate similarity. The name part is calculated with several classic string matching methods, the address part is based on the similarity of address segments, and the latitude-longitude part is the distance between the two POIs.

(2) The POI coordinates used in this thesis come from different online electronic maps, and the coordinates of the same entity are inconsistent across maps, which affects the later POI fusion work. To solve the problem of non-unified coordinate standards, the thesis proposes two solutions. One is based on a correction table, and the other is based on an API.

(3) A rule-based classification model is built. In this process, the thesis sets coefficients and thresholds for the various POI fields, performs regression calculations, and selects the threshold that best distinguishes POI-fusion cases to build the decision model. This calculation process is complex, time-consuming, and inflexible, and it has no automatic learning capability. The thesis therefore uses machine learning classifiers with active learning capabilities to construct several different classification models, and then selects the best classifier, which effectively improves classification performance.

The innovations of this thesis are as follows:

(1) Because Chinese text is made up of words and expressions, different Chinese characters differ in how closely they are related to one another. Considering this, the thesis takes the word as the smallest unit of Chinese string matching, rather than following the traditional assumption that the smallest unit is a single Chinese character.

(2) The non-spatial and spatial information of a POI is integrated and used as the basis for POI-fusion classification. POIs are then categorized with a rule-based model.

(3) Using machine learning classification methods, a POI fusion classification model with self-learning ability is built.

Experiments show that the technique presented in this thesis can classify multi-source POIs automatically and effectively, without human intervention.
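The POI characteristic similarity and the rule-based decision described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the string matching methods, coefficients, or thresholds, so the word-level edit distance, the Jaccard overlap of address segments, the 500 m distance cap, the weights, and the 0.75 threshold are all assumptions chosen for illustration. Names and addresses are assumed to be already segmented into words (for Chinese text this would require a word segmenter, which is not shown).

```python
import math


def word_edit_similarity(a, b):
    """1 - normalized Levenshtein distance over word tokens (not characters)."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete a word
                           cur[j - 1] + 1,             # insert a word
                           prev[j - 1] + (wa != wb)))  # substitute a word
        prev = cur
    return 1.0 - prev[-1] / max(len(a), len(b))


def segment_similarity(a, b):
    """Jaccard overlap of address segments (an assumed stand-in measure)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude-longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def coordinate_similarity(p, q, max_m=500.0):
    """Map distance to [0, 1]; max_m is a hypothetical cut-off."""
    d = haversine_m(p["lat"], p["lon"], q["lat"], q["lon"])
    return max(0.0, 1.0 - d / max_m)


def poi_similarity(p, q, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of name, address and coordinate similarity.

    The weights are hypothetical; the thesis fits its own coefficients.
    """
    w_name, w_addr, w_coord = weights
    return (w_name * word_edit_similarity(p["name"], q["name"])
            + w_addr * segment_similarity(p["addr"], q["addr"])
            + w_coord * coordinate_similarity(p, q))


def is_same_poi(p, q, threshold=0.75):
    """Rule-based decision: fuse when the score reaches a hypothetical threshold."""
    return poi_similarity(p, q) >= threshold


# Hypothetical records from two map sources.
a = {"name": ["Haidilao", "Hotpot"], "addr": ["Qingdao", "Hong Kong Middle Road", "10"],
     "lat": 36.0671, "lon": 120.3826}
b = {"name": ["Haidilao", "Hotpot"], "addr": ["Qingdao", "Hong Kong Middle Road", "10"],
     "lat": 36.0673, "lon": 120.3828}
c = {"name": ["Central", "Park"], "addr": ["Beijing", "Chaoyang Road", "5"],
     "lat": 39.9, "lon": 116.4}
print(is_same_poi(a, b), is_same_poi(a, c))
```

In a learned variant, the three similarity values would serve as the feature vector for a classifier instead of being combined with fixed weights, which is the direction the abstract describes for replacing the hand-tuned rule model.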
Keywords/Search Tags:data, fusion, POI, classification, title, geographic information