Text Annotation Of The Content Of The Map

Posted on: 2015-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: X J Li
Full Text: PDF
GTID: 2270330431978073
Subject: Cartography and Geographic Information Engineering
Abstract/Summary:
As map annotation services have developed, the security and suitability problems caused by open annotation have become serious. Yet the discovery of sensitive information in map annotations, and the evaluation of their suitability, have so far attracted little attention. This thesis analyzes the text attributes and spatial characteristics of annotations, aiming to identify abnormal content and evaluate spatial suitability, so as to keep the map environment clean and ultimately ensure the safety of map applications. We mainly study Chinese word segmentation, the sensitive word library, and the text similarity model. The main work is as follows:
(1) Research on a double-hash word segmentation dictionary mechanism. The efficiency of Chinese word segmentation directly affects the suitability evaluation of map annotations. This thesis analyzes a variety of segmentation dictionary mechanisms in detail and, in view of the differences between Chinese and English vocabulary, adopts a double-hash mechanism to build the Chinese word segmentation dictionary. Experiments show that the segmentation method is simple, fast, and suitable for Chinese word segmentation.
(2) Building a sensitive word library. Sensitive word thesauri are widely used to regulate and clean up BBS environments. This thesis analyzes the characteristics of sensitive words and, starting from BBS sensitive words and taking spatial characteristics into account, builds an annotation sensitive word library and classifies sensitive map annotations.
(3) Multiple pattern matching algorithm. Pattern matching is the most important step in sensitive information detection. We first introduce single-pattern and multi-pattern matching algorithms, then propose applying the AC-BM algorithm to detect sensitive words in map annotations. Because abnormal annotations contain both English and Chinese, the text is uniformly converted to Unicode before the AC tree is built and matched, which achieves rapid detection of sensitive words.
In summary, this thesis focuses on describing and identifying annotation exceptions: we study the word segmentation method, the sensitive word thesaurus, and the multi-pattern matching algorithm, with the goal of building a clean map environment.
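The multi-pattern matching step in item (3) can be illustrated with a small sketch. It is not the thesis's implementation: it shows a plain Aho-Corasick automaton, without the Boyer-Moore skipping that AC-BM adds, and it assumes that matching on Python's Unicode strings is a fair stand-in for the Unicode conversion described above. The function names `build_automaton` and `find_sensitive` and the sample words are hypothetical placeholders.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton keyed by single Unicode characters."""
    goto = [{}]   # goto[node][char] -> child node
    fail = [0]    # failure link of each node
    out = [[]]    # patterns that end at each node
    for pat in patterns:
        if not pat:
            continue  # ignore empty patterns
        node = 0
        for ch in pat:
            nxt = goto[node].get(ch)
            if nxt is None:
                goto.append({})
                fail.append(0)
                out.append([])
                nxt = len(goto) - 1
                goto[node][ch] = nxt
            node = nxt
        out[node].append(pat)
    # Breadth-first pass: depth-1 nodes keep failure link 0,
    # deeper nodes inherit from their parent's failure chain.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_sensitive(text, automaton):
    """Return (start_index, pattern) for every pattern occurrence in text."""
    goto, fail, out = automaton
    node = 0
    hits = []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pat in out[node]:
            hits.append((i - len(pat) + 1, pat))
    return hits


# Placeholder word list mixing English and Chinese entries.
automaton = build_automaton(["he", "she", "hers", "敏感"])
print(find_sensitive("ushers 敏感词", automaton))
# → [(1, 'she'), (2, 'he'), (2, 'hers'), (7, '敏感')]
```

Because the automaton branches on one Unicode character at a time, English and Chinese patterns share the same tree and the text is scanned once regardless of how many words are in the library.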
Keywords/Search Tags: Geographic annotation, Chinese word segmentation, Sensitive word detection, Pattern matching algorithm