
Min-Max Modular SVM Based On Cloud Platform

Posted on: 2016-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhao
Full Text: PDF
GTID: 2308330473465505
Subject: Computer technology
Abstract/Summary:
Support vector machine (SVM) is a common pattern classification method whose theoretical basis is the VC dimension and the structural risk minimization principle. It has been applied successfully to problems in many areas because of its strong learning ability and good generalization performance. However, training an SVM requires solving a quadratic optimization problem, and the training time grows at least quadratically with the number of training samples. The traditional SVM is therefore hard, or even impossible, to apply to large-scale problems.

Prof. Lu proposed the Min-Max Modular Support Vector Machine (M3-SVM) algorithm to improve the efficiency of SVM. M3-SVM is an efficient algorithm based on a "divide-and-conquer" strategy: it decomposes a large-scale problem into a large number of smaller, independent subproblems, each of which is learned independently. This can significantly reduce learning time; however, the subproblems are still trained serially. To further improve the efficiency of M3-SVM, we implement it in parallel on the Hadoop platform.

This thesis focuses on the following three aspects:

First, this thesis introduces the parallelization of M3-SVM based on the MapReduce framework. Distributed parallel computing was already considered when M3-SVM was first proposed, so M3-SVM can be combined with the MapReduce parallel framework to improve its efficiency. Experimental results show the reliability and high computing efficiency of the parallel M3-SVM based on MapReduce.

Second, the thesis studies concept drift in classification problems on the cloud platform. In practice, most data is produced as streams, and the concept underlying the data may drift over time. It is therefore a great challenge for traditional methods to detect the drift in time and to train a model suited to the current data stream. To deal with such problems, a parallel M3-SVM model is proposed to classify data streams. It proves effective at detecting concept drift in a timely manner and at constructing accurate models for the current data.

Third, a min-max modular support vector machine classification system based on the cloud platform is developed. To deal with the problem of pattern classification, we designed a system that is secure, user-friendly, interactive, scalable and maintainable. The greatest advantage of the system is that it visualizes the pattern classification problem. The system includes function modules such as platform selection, dataset loading and pattern classification.
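
The divide-and-conquer idea described above can be illustrated with a minimal sketch. The abstract does not spell out the combination rule, so the code assumes the usual min-max modular scheme for a two-class problem: the positive and negative training sets are each partitioned, one module is trained per pair of parts, and the module outputs are combined by taking the minimum over negative parts and then the maximum over positive parts. To stay dependency-free, the base learner here is a simple centroid classifier standing in for an SVM, and all function names (`split`, `train_module`, `train_m3`, `predict_m3`) are hypothetical rather than taken from the thesis.

```python
def split(samples, k):
    """Partition a list into k parts (round-robin); an assumed partitioning scheme."""
    return [samples[i::k] for i in range(k)]


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_module(pos, neg):
    """Stand-in base learner (NOT an SVM): the score is positive when x lies
    closer to the positive centroid than to the negative centroid."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: sq_dist(x, cn) - sq_dist(x, cp)


def train_m3(pos, neg, n_pos, n_neg):
    """Train one module per (positive part, negative part) pair.
    Each call to train_module is independent of the others, which is what
    makes the subproblems suitable for a parallel map phase."""
    pos_parts, neg_parts = split(pos, n_pos), split(neg, n_neg)
    return [[train_module(p, n) for n in neg_parts] for p in pos_parts]


def predict_m3(modules, x):
    """Min-max combination: MIN over negative parts, then MAX over positive parts."""
    return max(min(module(x) for module in row) for row in modules)


# Toy data: two well-separated classes in the plane.
pos = [(2, 2), (3, 2), (2, 3), (3, 3)]
neg = [(-2, -2), (-3, -2), (-2, -3), (-3, -3)]

modules = train_m3(pos, neg, n_pos=2, n_neg=2)
label = 1 if predict_m3(modules, (2.5, 2.5)) > 0 else -1
```

Because every module depends only on its own pair of parts, the nested list comprehension in `train_m3` could in principle be replaced by any parallel map, which is the property the thesis exploits with MapReduce.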
Keywords/Search Tags: Support Vector Machine, Min-Max Modular Support Vector Machine, Cloud Computing, Parallelization, Concept Drift