
Research And Application Of PVI Algorithm On Spatial Data Mining

Posted on: 2012-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: D D Zhang
GTID: 2178330332495571
Subject: Applied Mathematics
Abstract/Summary:
In recent years, the rapid development of spatial information technology has brought us into the information age. Data are collected continuously by sensors and other data-acquisition equipment, and their volume grows exponentially. Database technology has been developed to manage these data, and spatial information systems have been built on top of it. At present, however, a spatial information system still provides only the data themselves and cannot supply information beyond the data. Users are no longer satisfied with surface-level data retrieval and query; they want to dig deeper into the data and discover knowledge. Association rule algorithms discover knowledge from databases. Although they can handle many problems, they consume much time and many resources when applied to massive data. Introducing parallel computing into spatial association rule mining effectively reduces this time and resource consumption. The classic association rule algorithms include the Apriori algorithm, the DHP algorithm, and the Partition algorithm. They are mainly used in customer consumption analysis, catalogue design, commercial mail analysis, up-selling, warehouse planning, network failure analysis, and so on.

Building on a study and description of the TP-PB algorithm, this thesis proposes an algorithm called the PVI algorithm, which computes frequent itemsets through a method similar to calculating the inner product of vectors. It is designed for a remote sensing data mining system, and it makes it possible to use many inexpensive computing resources to complete tasks that would otherwise require a large-scale computer. The main results of the thesis are as follows.

1) Frequent itemsets are computed in a simple way. The PVI algorithm mines Boolean data, which effectively reduces the complexity of the algorithm. It computes frequent itemsets through a method similar to calculating the inner product of vectors, and it derives the frequent k-itemsets from the frequent (k-1)-itemsets.
This greatly simplifies the calculation steps and improves data parallelism.

2) The number of database scans is reduced. The PVI algorithm needs to scan the database only once, because it records the necessary information while computing the frequent itemsets, whereas the TP-PB algorithm needs to scan the database twice. I/O operations on the data consume a lot of time, so reducing the number of database scans greatly improves the efficiency of the algorithm.

3) Parallel computing is introduced into spatial association rule mining. The spatial data are divided into data blocks, which are processed on parallel computers at the same time, so the processing time is reduced. The block size should be chosen appropriately.

4) The PVI algorithm is embedded in a remote sensing data mining system in order to display remote sensing data and discover knowledge. The system adopts the B/S (browser/server) model, so users can query remote sensing data at any time from a web browser over the network. The system uses WPF, Microsoft's graphics development framework, and supports operations such as pan, zoom in, zoom out, and full extent. Users submit a remote sensing data mining task after setting the minimum support and minimum confidence. The inherent rules found in the remote sensing data are stored in text form.
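
The abstract does not give the details of the PVI algorithm, so the following is only a minimal sketch of the general idea it describes: support counting on Boolean data through an inner-product-like operation, with k-itemsets derived from stored (k-1)-itemset vectors after a single scan of the data. The function name `frequent_itemsets`, the list-of-sets input format, and the sample data are assumptions made for illustration and are not taken from the thesis.

```python
def frequent_itemsets(transactions, min_support):
    """Return {itemset as sorted tuple: support count} for every frequent itemset.

    Illustrative sketch only; not the thesis's actual PVI implementation.
    """
    n = len(transactions)

    # Single scan of the data: build one Boolean (0/1) column vector per item.
    columns = {}
    for row, transaction in enumerate(transactions):
        for item in transaction:
            columns.setdefault(item, [0] * n)[row] = 1

    # Frequent 1-itemsets: the support of an item is the inner product of its
    # Boolean column with itself, i.e. the number of ones in the column.
    level = {}
    for item in sorted(columns):
        vec = columns[item]
        if sum(a * a for a in vec) >= min_support:
            level[(item,)] = vec
    frequent_items = [itemset[0] for itemset in level]

    result = {}
    while level:
        for itemset, vec in level.items():
            result[itemset] = sum(vec)
        # Derive k-itemsets from the stored vectors of frequent (k-1)-itemsets.
        # No further pass over the raw transactions is needed.
        next_level = {}
        for itemset, vec in level.items():
            for item in frequent_items:
                if item <= itemset[-1]:
                    continue  # keep items in sorted order to avoid duplicates
                # Element-wise product = rows containing every item so far.
                product = [a * b for a, b in zip(vec, columns[item])]
                if sum(product) >= min_support:
                    next_level[itemset + (item,)] = product
        level = next_level
    return result


# Hypothetical sample data: five transactions over the items a, b, c.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = frequent_itemsets(sample, min_support=3)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Because each support count in this sketch is a sum over rows, the rows could be split into blocks, the partial sums computed independently, and the results added. That is one plausible way to read the block-wise parallel processing described in result 3), not a statement of how the thesis implements it.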
Keywords/Search Tags:spatial data, association rules, parallel computing, frequent itemset, pruning