
Research On Spatial Data Mining Of Network Self-Media

Posted on: 2019-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhong
Full Text: PDF
GTID: 2348330548457925
Subject: Cartography and Geographic Information System
Abstract/Summary:
Spatial data mining (SDM) is a discipline that explores spatial databases and the spatial information contained within spatial entities. Its main methods include cluster analysis, spatial analysis, and data visualization. Cluster analysis divides the samples in a dataset into clusters according to their similarity, so that samples in the same cluster are more similar to each other than to samples in other clusters. The microblog platform publishes petabyte-scale data every day, and these data contain information about many aspects of society and daily life. This thesis treats every microblog user as a spatial entity and applies cluster analysis to microblog data with location attributes in order to discover hot-spot words related to current society and daily life. Through visualization, the samples in the clustering results are presented on a map to study their spatial distribution. The main clustering algorithm is k-means, implemented on the Hadoop distributed computing platform using Mahout. Plain k-means and k-means optimized by the Canopy algorithm are compared on text clustering, along with the changes in convergence speed, number of iterations, and inter-cluster distance under different input parameters. The results show that the clustering quality of Canopy-optimized k-means is clearly higher than that of ordinary k-means. However, the optimization has little effect on the topics of the text clusters; its main effect is to reduce the similarity between clusters, which prevents a single theme from being split into multiple topic categories. For visual analysis, ArcGIS and WebGIS are used to perform kernel density analysis of the clusters, followed by raster analysis on a fishnet grid; this gives the discrete cluster samples adjacency and makes the main spatial distribution of each cluster topic intuitively visible.
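The idea of using Canopy to optimize k-means can be illustrated with a minimal sketch. It is not the thesis's Mahout/Hadoop implementation: it is a single-machine Python illustration on 2-D points, and the function names, the thresholds `t1` (loose) and `t2` (tight), the Euclidean distance, and the sample data are all assumptions made for the example. Canopy makes one cheap pass over the data to estimate how many clusters there are and where they lie, and those canopy centers are then used as the initial centers for k-means instead of random ones.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(members):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(members[0])
    return tuple(sum(m[i] for m in members) / len(members) for i in range(dim))


def canopy_centers(points, t1, t2, distance=euclidean):
    """One-pass Canopy: returns one center per canopy.

    t1 is the loose threshold (canopy membership), t2 the tight threshold
    (points closer than t2 are removed from further consideration).
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining.pop(0)
        members = [seed]
        kept = []
        for p in remaining:
            d = distance(seed, p)
            if d < t1:
                members.append(p)   # p belongs to this canopy
            if d >= t2:
                kept.append(p)      # p may still seed or join another canopy
        remaining = kept
        centers.append(mean_point(members))
    return centers


def kmeans(points, centers, max_iter=100, tol=1e-6, distance=euclidean):
    """Lloyd's k-means starting from the given initial centers.

    Returns (centers, clusters, iterations_used).
    """
    centers = [tuple(c) for c in centers]
    clusters = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = [mean_point(m) if m else c
                       for c, m in zip(centers, clusters)]
        shift = max(distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, clusters, iterations


if __name__ == "__main__":
    # Hypothetical sample data: two well-separated groups of points.
    data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
            (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
    seeds = canopy_centers(data, t1=3.0, t2=1.5)
    final_centers, groups, n_iter = kmeans(data, seeds)
    print(len(seeds), "canopies;", n_iter, "k-means iteration(s)")
    print(final_centers)
```

Because the canopy centers already sit close to the final cluster centers, k-means started from them typically needs fewer iterations than a random start, which is consistent with the convergence comparison described in the abstract. For text clustering the points would be term-weight vectors and the distance would usually be a cosine-based measure rather than the Euclidean distance assumed here.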
Keywords/Search Tags:k-means, canopy, micro-blog, cluster analysis, spatial data mining, WebGIS, ArcGIS