Research On Intelligent Clustering Learning Algorithm For GNSS Data

Posted on: 2019-07-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X B Zhou
GTID: 1360330548477632
Subject: Earth Exploration and Information Technology
Abstract/Summary:
The prevalence of smart Global Navigation Satellite System (GNSS) devices has made it easy to track moving objects over the mobile internet. As a result, large amounts of GNSS-based data and trajectories are generated and can be used to explain geological and spatial changes. This rapidly growing GPS data plays an important role in earth exploration and supports the development of earth exploration technology. A huge amount of information is hidden in GPS data and trajectories; it is useful for providing many services and for supporting the development of 3S-based applications (GIS (Geographic Information System), RS (Remote Sensing), and GPS), such as urban land use, navigation and positioning, population migration distribution, and urban traffic flow. Mining this information has become a research hotspot in the earth exploration field in the big data era.

This paper aims to mine the hidden information in GPS data using partition-based clustering algorithms (K-means/K-means++, Fuzzy C-means, K-median) and to overcome their shortcomings, including slow convergence, sensitivity to the selection of initial seeds, difficulty in exploiting good-quality clusters from previous iterations, and getting stuck in a local optimum. To this end, it proposes and focuses on several intelligent algorithms (genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO)) combined with noise, density, gene rearrangement, niche, fuzzy system, Canopy, and MapReduce techniques for partition clustering. These are used to perform automatic clustering and then to extract value from given GPS datasets for urban development and earth exploration. The methods can also be used to mitigate premature convergence, enhance global optimization performance, and maintain diversity in intelligent learning algorithms. The main contributions of the paper are summarized as follows:

(1) An improved noise method, a density-based method, the Canopy method, and K-means++ are presented to construct three
different initial populations with low complexity and capture higher-quality seeds that can automatically determine the proper number of clusters and also handle genes of different sizes and shapes; the resulting methods are designated NoiseClust, SeedClust, and NicheClust, respectively. In NoiseClust, SeedClust, and NicheClust, the initial population is produced and better initial seeds are captured by, respectively, the improved noise method with K-means++, a density-based method with an improved K-means++, and a density-based method with an improved Canopy method. In particular, a density-based method and a sharing function method are presented to determine the number of niches and to maintain population diversity, respectively, in NoiseClust. Improved gene rearrangement and adaptive crossover and mutation probabilities are used in NoiseClust, SeedClust, and NicheClust to prevent convergence to a local optimum; these methods are then integrated into the genetic algorithm and used to capture the best chromosome. Finally, the cluster centers (the seeds from the best chromosome) are fed into the partitioning clustering algorithms as initial seeds, which generates even higher-quality clustering results by allowing the initial seeds to readjust as needed. Experimental results on taxi GPS data sets demonstrate that NoiseClust, SeedClust, and NicheClust are efficient and effective and can readily reveal urban conditions.

(2) A hybrid method combining GA with fuzzy PSO (FPSO) is presented to achieve high-quality partition clustering. In this method, the global optimization ability of PSO in the early stage of iteration is used to avoid premature convergence of GA, and the crossover, mutation, and elitist operations of GA are used to update particle positions and maintain the diversity of the particle swarm. A fuzzy system is designed to produce the PSO parameters (inertia weight and learning factors) so that they adapt to changes in fitness value. Specifically, the adaptive fuzzy
system is first built from triangular fuzzy membership functions, fuzzy rules, and Mamdani fuzzy reasoning and is then integrated into PSO, and the noise method in (1) is used to produce the particle swarm. Secondly, FPSO produces better individuals, which are then used in the GA operations. Finally, either the position of the best individual is used to seed K-means clustering, or K-means clustering is combined with FPSO and GA and, once the termination condition is met, the best individual is taken as the K-means clustering result. Experimental results demonstrate that the hybrid methods achieve higher clustering performance.

(3) A novel hybrid clustering method combining GA and fuzzy ACO (FACO) is proposed for GPS data clustering; it can be used to improve initial seed selection and avoid getting stuck in a local optimum. That is, the method can capture cluster centers during optimization without requiring initial seeds to be set in advance. Firstly, a fuzzy system is used to produce the ACO parameters according to changes in fitness value and ACO attributes, which yields a novel ACO algorithm (adaptive ACO, or FACO). Secondly, the noise method in (1) is used to produce the initial population, and GA operations are performed to capture the best chromosome; this gives the number of clusters, which is then used in the FACO operations. Thirdly, the local and global pheromone updating methods of ACO are used to prevent search stagnation. Finally, the number of clusters and the optimal seeds are captured simultaneously and used for K-means clustering. Experimental results demonstrate that the hybrid method has very good clustering performance and a higher cluster evaluation value.

(4) To improve clustering performance on big data sets and find more cluster centers, a large-volume GPS data clustering method based on a novel GA is proposed for cloud computing. Firstly, Canopy and K-means++ based on MapReduce are used to produce the initial
population, where the Canopy threshold is adjusted to capture the proper number of clusters in big data sets and a sampling method is used in K-means++ to choose more appropriate seeds. Secondly, GA operations are performed on the population to capture the best chromosome. Thirdly, the best chromosome is used to seed MapReduce-based K-means clustering. Finally, experimental results on big data sets (1.9M, 19M, and 208M) indicate that the presented method has high performance and a low computing cost.

(5) A trajectory regression clustering method (an unsupervised trajectory clustering method) is proposed to reduce the loss of local trajectory information and avoid getting stuck in a local optimum. In this method, we first define a new concept of trajectory clustering and construct a novel angle-based method for partitioning trajectories into line segments; second, a Lagrange-based method and Hausdorff-based K-means++ are integrated into fuzzy C-means (FCM) clustering to maintain the stability and robustness of the clustering process; finally, a least squares regression model is employed to achieve regression clustering of the trajectories. The performance and effectiveness of the method are validated on real-world taxi GPS data. Compared with the partition-based clustering algorithms (K-means, K-median, and FCM), the presented method is more effective and generates more reasonable trajectories.
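
The pipeline shared by contributions (1) through (4), in which a genetic algorithm searches for good initial seeds and the best chromosome then seeds K-means, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: it assumes random initial chromosomes in place of the noise, density, or Canopy initialization, a fixed mutation probability in place of adaptive ones, a fixed number of clusters k, and it omits gene rearrangement. The function names (ga_seeds, kmeans, sse) and the synthetic 2-D points are hypothetical.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centers):
    """Fitness: sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given seeds."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def ga_seeds(points, k, pop_size=20, generations=30, p_mut=0.2, rng=None):
    """Evolve chromosomes (lists of k candidate seeds) to minimize SSE."""
    rng = rng or random.Random(0)
    # Assumption: random initial population (the source uses noise/density/Canopy).
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ch: sse(points, ch))
        elite = population[: pop_size // 2]            # elitism: keep the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, k) if k > 1 else 0  # single-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < p_mut:                   # mutation: replace one seed
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        population = elite + children
    return min(population, key=lambda ch: sse(points, ch))

# Synthetic stand-in for GPS points: three well-separated 2-D blobs.
rng = random.Random(42)
blobs = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blobs for _ in range(30)]

seeds = ga_seeds(points, k=3, rng=rng)   # best chromosome found by the GA
centers = kmeans(points, seeds)          # K-means readjusts the seeds
print(centers)
```

Because the GA only has to find one reasonable seed per cluster, K-means then needs few iterations to settle; the fitness function (SSE here) could be swapped for another cluster validity index without changing the structure.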
Keywords/Search Tags: Global Navigation Satellite System (GNSS) data, intelligent learning, automatic clustering algorithm, MapReduce, GPS trajectory/GPS data