
The Study Of Distributed Spatial Data Mining Algorithm Based On Grid Technology

Posted on: 2009-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: B Hu
Full Text: PDF
GTID: 2178360245982362
Subject: Geodesy and Survey Engineering
Abstract/Summary:
Spatial data mining is a natural outcome of the development of spatial information technology, and two main factors lie behind its emergence. First, as the research field has steadily expanded, the objects of data mining have grown from relational and transactional data to spatial databases. Second, with the wide application of satellite and remote sensing technology in the geographical sciences, abundant spatial and non-spatial data have been collected and stored. To some extent this volume exceeds people's capacity to process it, and traditional spatial analysis can hardly extract and discover geographical knowledge from such massive data. As John Naisbitt put it, "We are drowning in information but starved for knowledge." Automatically mining knowledge from spatial databases, that is, finding implicit and hidden knowledge, spatial relationships, and other patterns, is the task of spatial data mining (SDM), and it has become increasingly important.

GIS spatial databases are typically very large and stored in a distributed fashion. If only a centralized processing model is used, applying SDM to acquire hidden knowledge and information from spatial databases or data warehouses faces serious challenges, both in the efficiency of data processing and in the security of the spatial data itself. Distributed and parallel data mining is therefore one of the current research hot spots, and the research and development of the spatial knowledge grid provides a good computing environment and promising applications for data mining.

This dissertation starts from serial algorithms for finding association rules, then discusses the parallel processing of association rule mining, and studies the related architecture that applies grid technology to spatial data mining. The main contributions are as follows:

i. A systematic study of spatial association rule algorithms. First, the Apriori and FP-growth algorithms are discussed in turn and several methods for improving their performance are given; the two algorithms are also run on five datasets of different sizes to compare their performance. Second, the theory of applying spatial statistics to spatial data mining is investigated; using the 2004 to 2006 GDP growth rates of Hunan Province as an example, the spatial association relationships among the GDP growth rates of 13 prefecture-level regions are obtained.

ii. Parallel processing models for spatial association rules. To accommodate the large volume and distributed storage of GIS spatial databases, the thesis gives a general structure for distributed algorithms that find spatial association rules, then introduces four parallel computing methods, the CD, CD-LGP, DD, and HD algorithms, and analyzes the performance of each.

iii. An architecture for spatial data mining based on grid technology. Following the OGSA architecture, the thesis analyzes the basic characteristics of a service-oriented architecture for spatial data mining and discusses in depth the implementation mode and workflow of the data access service, the data agent service, and the spatial data mining service.

iv. A general strategy and method for partitioning spatial data in a grid environment. Grid simulation experiments based on GridSim show that, under specific conditions, the data partitioning strategy has an optimal solution. This provides a basis for automatically decomposing whole tasks and arranging them optimally in the grid after the grid agent accepts tasks submitted by users.
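
To illustrate how Apriori combines with a count-based parallel scheme, the following is a minimal sketch and not the thesis's implementation. It assumes that "CD" refers to the Count Distribution idea: every node holds one data partition, counts the same candidate itemsets locally, and the local counts are summed into global counts. The "nodes" are simulated in a single process as a list of partitions, and all function names and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """Count how many transactions in one partition contain each candidate."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def next_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any candidate
    that has an infrequent (k-1)-subset (the Apriori property)."""
    joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
    return {
        c for c in joined
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }


def count_distribution_apriori(partitions, min_support):
    """Return {itemset: global count} for itemsets with count >= min_support."""
    partitions = [[frozenset(t) for t in p] for p in partitions]
    candidates = {frozenset([item]) for p in partitions for t in p for item in t}
    result = {}
    k = 1
    while candidates:
        # Each simulated node counts locally; summing the counters stands in
        # for the exchange of local counts between real nodes (assumption).
        total = Counter()
        for p in partitions:
            total.update(local_counts(p, candidates))
        frequent = {c for c in candidates if total[c] >= min_support}
        result.update({c: total[c] for c in frequent})
        k += 1
        candidates = next_candidates(frequent, k)
    return result


# Hypothetical transactions split across two simulated nodes.
partitions = [
    [{"road", "river"}, {"road", "school"}],
    [{"road", "river", "school"}, {"river"}],
]
frequent_itemsets = count_distribution_apriori(partitions, min_support=2)
```

In this sketch only the counts cross partition boundaries, never the transactions themselves, which is why count-based schemes suit data that must stay where it is stored.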
Keywords/Search Tags:grid technology, spatial data mining, distributed algorithm, grid service, data partition