The Research Of Sandstorm Meteorological Data Mining Based On Cloud Computing And SVM

Posted on: 2015-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: P X Zhang
GTID: 2298330467466043
Subject: Computer application technology
Abstract/Summary:
As numerical prediction models continue to develop, meteorological stations use numerical prediction as their main means of weather forecasting. However, atmospheric motion is very complicated and much of it has not been thoroughly studied, so the accuracy of detailed numerical forecasts of meteorological factors is not high. Meteorological observation data is massive and is characterized by spatio-temporal attributes, high dimensionality, complexity, and high correlation. Meteorological research and operations need to discover the characteristics and laws hidden in the observed data, and this process is itself a data mining process. Traditional data mining algorithms struggle to meet this demand, so overcoming the bottlenecks of mining algorithms in efficiency, adaptability, and availability is becoming increasingly important.

The support vector machine (SVM) is a classification algorithm based on statistical learning theory that has been widely applied in machine learning. Its advantages are that it rarely overfits, it is not strongly affected by the curse of dimensionality caused by too many features, its converged solution is the global optimum, and kernel functions can be used flexibly. As the training set grows, however, SVM learning becomes computationally intensive. The traditional SVM is therefore not well suited to large-sample systems: a large data set greatly increases training time, and the training result is not ideal.

Cloud computing is a new type of computing model that brings new solutions for the storage and processing of massive data. To address problems such as high memory consumption and slow optimization during computation, this paper studies parallel SVM algorithms and the MapReduce parallel programming technique in depth, analyzes the existing parallel strategies, and presents a solution to the heavy computational cost of training SVM on large data sets. The contributions of this paper are as follows.
(1) Through a large number of experiments, it analyzes the performance of the SVM grouping parallel strategy proposed by Deng Kun et al. and of the cascade training model proposed by Graf et al.
(2) It presents a cascading grouping parallel SVM algorithm that combines group training with the cascade training model, and implements it on the MapReduce parallel framework, which solves the low efficiency of the cascade training model. The experimental results show that, while keeping the loss of accuracy small, it speeds up the convergence of the training model and improves the execution efficiency of the algorithm.
(3) It analyzes and studies the meteorological factors that affect the formation of sandstorms.
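The idea of combining group training with cascade training can be illustrated with a small sketch. It is not the thesis's implementation. It assumes a linear SVM fitted by plain sub-gradient descent in place of a kernel solver, a single cascade pass with no feedback loop, and in-process loops in place of real MapReduce jobs. All names (`train_linear_svm`, `support_vectors`, `cascade_group_svm`, `make_two_clusters`) are hypothetical and chosen for illustration only.

```python
import random


def decision(model, x):
    """Signed score w.x + b of a linear model (w, b)."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, epochs=100, eta=0.01, lam=0.01, seed=0):
    """Linear soft-margin SVM fitted by stochastic sub-gradient descent on the
    L2-regularised hinge loss. Assumed stand-in for a real SVM solver."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    b = 0.0
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            violated = y * decision((w, b), x) < 1.0
            w = [(1.0 - eta * lam) * wi for wi in w]  # regularisation shrink
            if violated:  # hinge loss is active for this sample
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def support_vectors(samples, model, tol=0.1):
    """Keep samples on or inside the margin, i.e. y * f(x) <= 1 + tol."""
    kept = [(x, y) for x, y in samples if y * decision(model, x) <= 1.0 + tol]
    # Guard: keep at least the closest sample of each class that is present,
    # so the next cascade stage still sees both classes.
    for label in (-1, 1):
        members = [(x, y) for x, y in samples if y == label]
        if members and not any(y == label for _, y in kept):
            kept.append(min(members, key=lambda s: s[1] * decision(model, s[0])))
    return kept


def cascade_group_svm(samples, n_groups=4):
    """Split the data into groups, train each group independently ("map"),
    merge the surviving support vectors pairwise ("reduce"), and repeat until
    one set remains. Assumes every group is non-empty."""
    groups = [samples[i::n_groups] for i in range(n_groups)]
    while len(groups) > 1:
        reduced = [support_vectors(g, train_linear_svm(g)) for g in groups]
        groups = [
            reduced[i] + (reduced[i + 1] if i + 1 < len(reduced) else [])
            for i in range(0, len(reduced), 2)
        ]
    return train_linear_svm(groups[0])


def make_two_clusters(n_per_class, seed=0):
    """Synthetic, well-separated 2-D data; purely illustrative."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)], label))
    rng.shuffle(data)
    return data


def accuracy(model, samples):
    """Fraction of samples whose predicted sign matches the label."""
    hits = sum(1 for x, y in samples if (decision(model, x) >= 0) == (y > 0))
    return hits / len(samples)


if __name__ == "__main__":
    data = make_two_clusters(100, seed=1)
    model = cascade_group_svm(data, n_groups=4)
    print("training accuracy:", accuracy(model, data))
```

Each cascade stage trains only on a fraction of the data or on the support vectors that survived the previous stage, which is where the reduction in training cost is expected to come from. In a MapReduce setting, the per-group training would run in parallel mappers and the pairwise merge would be carried out by reducers.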
Keywords/Search Tags: Meteorological Data Mining, Cloud Computing, SVM, MapReduce