Research Of Handwritten Digit Recognition Based On SVM

Posted on: 2007-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: L L Wu
Full Text: PDF
GTID: 2178360182496981
Subject: Computer software and theory
Abstract/Summary:
Support Vector Machine (SVM) is a learning method for classification and regression that is grounded in statistical learning theory and was proposed by Vapnik in 1995. It is a learning system whose hypothesis space consists of linear functions in a high-dimensional feature space. In recent years, both its theory and its algorithmic implementations have advanced significantly, and it has become a powerful method for overcoming traditional difficulties such as the curse of dimensionality and overfitting. SVM is attracting growing attention because of its many notable advantages and its promising experimental performance. It has become a hot topic in machine learning research and has achieved very good results in applications such as face recognition, handwritten digit recognition, and web classification.
Handwritten digit recognition has broad application prospects in many domains, and researchers in China and abroad have done much work on it. They have reported many preprocessing algorithms and pattern recognition algorithms, which have greatly improved the accuracy of handwritten digit recognition. However, recognition accuracy still needs to be improved, and the problem of selecting kernel functions and kernel parameters remains to be solved.
To improve the accuracy of handwritten digit recognition, this paper applies the support vector machine to handwritten digit recognition and develops a software system named SVM-HDR. Building on a summary of previous research, this paper focuses on the factors that influence the performance of SVM classification. We lay out a step-by-step procedure for choosing the best factors in order to validate the effectiveness of the support vector machine for handwritten digit recognition. In addition, this paper presents a method named virtual samples that introduces prior knowledge into the handwritten digit recognition procedure, with the aim of further improving the accuracy of handwritten digit recognition based on the support vector machine.
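For reference, the kernel function and kernel parameters mentioned above enter an SVM through its decision function. The standard textbook form is sketched below; the notation, and the Gaussian (RBF) kernel used as an example, are assumptions for illustration and are not taken from the thesis.

```latex
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right),
\qquad
K(x_i, x) = \exp\!\left( -\gamma \lVert x_i - x \rVert^2 \right)
```

Here $x_i$ are the training samples with labels $y_i \in \{-1, +1\}$, the coefficients $\alpha_i \ge 0$ are nonzero only for support vectors, and $b$ is the bias. With this example kernel, the parameters to be selected would be $\gamma > 0$ together with the penalty parameter $C$ of the soft-margin formulation.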
The main work of this paper covers the following aspects:
⑴ Analyzing and comparing the methods of multi-class classification.
This paper analyzes and compares three multi-class classification methods: one-against-all, one-against-one, and Directed Acyclic Graph. Experiments on concrete databases compare them in terms of recognition accuracy, training time, and testing time in order to find the multi-class method best suited to handwritten digit recognition. According to the experimental results, one-against-one is the most appropriate method.
⑵ Comparing and analyzing the training algorithms.
The three main training algorithms, namely the Chunking algorithm, the Osuna algorithm, and the SMO algorithm, are compared in terms of speed, accuracy, and memory usage. The comparison shows that SMO is fast, accurate, and needs less memory, which makes it suitable for large-scale problems. Therefore, the SMO algorithm is selected for handwritten digit recognition.
⑶ Validating the effectiveness of the support vector machine for handwritten digit recognition.
This paper applies the support vector machine to handwritten digit recognition in order to improve recognition accuracy. The procedure is organized as a workflow in the SVM-HDR software system, applying the best-performing choice at every step. The paper presents a series of operations on the MNIST handwritten digit database, including data preprocessing, scaling, selection of the best kernel function and kernel parameters, training, and testing. Comparing the experimental results with those obtained by other techniques on the same database validates the effectiveness of the SVM recognition method.
⑷ Proposing the virtual samples method to introduce prior knowledge.
The support vector set is a subset of the training set that carries essentially all of the training set's information relevant to classification; that is, only this subset affects the solution of the classification problem. Therefore, the test performance of a second SVM trained on the support vector set produced by a first SVM is no worse than that of training on the whole training set.
Building on the system used to validate the effectiveness of the support vector machine for handwritten digit recognition, we extract the support vector set of the training set. Relying on the prior knowledge that certain transformations of an input image do not change the recognition result, we apply translation invariance to this set to generate artificial support vectors. The result is a virtual sample set five times the size of the original support vector set, including the unchanged support vectors.
We then retrain on the virtual samples and evaluate the test set with the model produced by retraining. Comparing the results with those obtained when validating the effectiveness of the SVM classification method shows that the virtual samples method effectively improves the accuracy of handwritten digit recognition based on the support vector machine and gives very good results.
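The virtual-sample step described in ⑷ can be illustrated with a minimal sketch. It assumes that the fivefold set comes from keeping each support vector and adding four translated copies, and that each translation is a one-pixel shift up, down, left, or right with zero padding. The abstract does not state the shift size or directions, so `SHIFTS`, `shift_image`, and `make_virtual_samples` are hypothetical names and choices, not the SVM-HDR implementation.

```python
# Sketch of virtual-sample generation from support vectors.
# Assumptions (not stated in the abstract): images are lists of pixel rows,
# background pixels are 0, and the four translations are one-pixel shifts.

SHIFTS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # (row offset, column offset)


def shift_image(image, d_row, d_col, fill=0):
    """Return a copy of image translated by (d_row, d_col), padded with fill."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            src_r, src_c = r - d_row, c - d_col
            if 0 <= src_r < height and 0 <= src_c < width:
                shifted[r][c] = image[src_r][src_c]
    return shifted


def make_virtual_samples(support_images, support_labels, shifts=SHIFTS):
    """Keep every support vector and add one translated copy per shift."""
    images, labels = [], []
    for image, label in zip(support_images, support_labels):
        images.append(image)
        labels.append(label)
        for d_row, d_col in shifts:
            images.append(shift_image(image, d_row, d_col))
            labels.append(label)  # a translation does not change the digit
    return images, labels


# Tiny 3x3 "image" with a single lit pixel in the centre, labelled as a 7.
sv_images = [[[0, 0, 0], [0, 9, 0], [0, 0, 0]]]
sv_labels = [7]
virtual_images, virtual_labels = make_virtual_samples(sv_images, sv_labels)
print(len(virtual_images))  # 5: the original plus four shifted copies
```

In the workflow the abstract describes, the enlarged set would then be used to retrain the SVM, and the retrained model would be evaluated on the unchanged test set.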
Keywords/Search Tags: Support Vector Machine, handwritten digit recognition, multiclass classification, prior knowledge, SVM-HDR