The Improvement Of Affinity Propagation Clustering Algorithm And Its Application On Haze Prediction

Posted on: 2019-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: D D Ju
Full Text: PDF
GTID: 2371330548951860
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of information technology and database technology, the volume of data has exploded. Data mining, as a new technology for analyzing big data, has attracted wide attention in many fields. Cluster analysis is an important branch of data mining: through unsupervised learning, it can extract potentially valuable information from massive data and provide an effective basis for scientific decision-making. Affinity propagation (AP) clustering, one of the most popular clustering algorithms, has been widely used in many fields, but it also has some deficiencies. Because AP clustering has difficulty constructing an effective similarity matrix on complex datasets with adhesion samples, an AP algorithm based on a weighting factor and a neighborhood-based density factor (NDF) is proposed. The algorithm uses the weighting factor to improve the calculation of the local density of the data and introduces the NDF into AP clustering to divide the dataset into core points and marginal points. It then constructs a connected graph in different ways according to the type of point, and finally computes a similarity matrix that reflects the real distribution of the data, on which clustering is performed. Simulation experiments on artificial datasets and UCI standard datasets show that the algorithm improves clustering accuracy, which verifies its effectiveness and feasibility. In addition, by combining the improved affinity propagation algorithm with local density as an under-sampling method, an under-sampling SVM algorithm based on improved affinity propagation clustering and local density is proposed to address the problem of SVM classification on imbalanced datasets, and its effectiveness is verified by experiment. Finally, in view of the imbalanced nature of haze data, this algorithm is used to build a haze prediction model. Air quality data from the Beijing area are selected as samples, and comparison with other haze prediction models shows that this model improves the accuracy of haze prediction.
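For readers unfamiliar with the baseline that the thesis sets out to improve, the following is a minimal sketch of standard affinity propagation message passing only. It does not implement the weighting factor, the NDF, or the connected-graph similarity described in the abstract. The function names, the negative squared Euclidean similarity, the median preference, the damping value, and the toy data are all assumptions chosen for illustration.

```python
from statistics import median


def similarity_matrix(points):
    """Negative squared Euclidean distance; the diagonal holds the preference.

    Assumption: the preference is set to the median off-diagonal similarity,
    a common default, not something taken from the thesis.
    """
    n = len(points)
    S = [[-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
          for j in range(n)] for i in range(n)]
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for k in range(n):
        S[k][k] = pref
    return S


def affinity_propagation(S, damping=0.5, max_iter=200):
    """Standard AP on a square similarity matrix S (needs at least 2 points).

    Returns (exemplars, labels); labels[i] is the exemplar index chosen for i.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(max_iter):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k), i' not in {i, k})
        # a(k, k) = sum of positive r(i', k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], []
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Toy data (hypothetical): two well-separated groups on a line.
points = [(-1.0,), (0.0,), (1.0,), (4.0,), (5.0,), (6.0,)]
exemplars, labels = affinity_propagation(similarity_matrix(points))
print(exemplars, labels)
```

Because the result depends entirely on the similarity matrix passed in, a different similarity construction, such as the density-aware one proposed in the thesis, could in principle be substituted for `similarity_matrix` without changing the message-passing loop.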
Keywords/Search Tags: Affinity propagation, Weighting factor, Neighborhood-based density factor, Imbalanced data, SVM, Haze prediction