The Study Of Semi-supervised Dimensionality Reduction In Handwriting Identification

Posted on: 2013-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Di
Full Text: PDF
GTID: 2268330392465642
Subject: Computer application technology
Abstract/Summary:
Handwriting identification is a science and technique that determines the identity of a writer from the writer's handwriting style. Handwriting samples can be collected non-invasively, and the method is readily accepted by most people. Handwriting has been widely used to establish identity in the judicial, financial, archaeological, public security, insurance, and other fields. With the development of computer technology and artificial intelligence, handwriting identification, as a biometric technology, has made tremendous progress. In recent years it has become a hot research topic in computer vision and pattern recognition.

In practical applications, we often obtain a large number of unlabeled samples and only a very small number of labeled samples. Traditional supervised learning requires a large number of labeled samples, while unsupervised methods use only the unlabeled samples and ignore the labeled ones. Semi-supervised learning, which learns from both labeled and unlabeled samples, has therefore become a new research topic in machine learning. It has now extended from semi-supervised classification and semi-supervised clustering to semi-supervised regression and semi-supervised dimensionality reduction. Compared with semi-supervised classification, clustering, and regression, research on semi-supervised dimensionality reduction is still relatively rare.

This paper studies semi-supervised dimensionality reduction algorithms in handwriting identification. The main work is as follows:

Firstly, the handwriting images are preprocessed to form normalized handwriting texture images. Based on texture analysis, an improved multi-channel Gabor wavelet is used to extract the texture features of the handwriting images. The Gabor kernel function uses 40 channels; the mean and variance of each channel are extracted as features, so an 80-dimensional feature vector is obtained for each handwriting image.

Secondly, a new semi-supervised dimensionality reduction algorithm, geodesic distance based semi-supervised locality dimensionality reduction (GSLDR), is proposed for handwriting data. It was designed by analyzing the advantages and disadvantages of existing semi-supervised dimensionality reduction algorithms and taking into account the characteristics of handwriting identification data. The algorithm uses the geodesic distance instead of the Euclidean distance, which cannot reflect the structure of the data. It also expands the pairwise constraints, which strengthens their guiding role in the dimensionality reduction, and then adds the constraints to the nearest-neighbor graph so that the graph reflects the manifold structure of the data more realistically.

Finally, the experiments were carried out in the Matlab environment. The 150 handwriting images from 15 different people, 10 per person, were divided into a training set and a test set: 5 images from each person, 75 in total, were used as the training set, and the remaining images were used as the test set. The dimension of the feature vectors of the test data set was reduced using the algorithm proposed in this paper and other semi-supervised dimensionality reduction algorithms respectively. A k-nearest neighbor classifier was then used for identification. The experimental results verify the validity of the proposed semi-supervised dimensionality reduction algorithm in handwriting identification.
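The abstract does not say how the geodesic distance is computed. A common approximation, assumed here and not confirmed as the thesis's method, is the Isomap-style one: connect each sample to its k nearest neighbors and take the shortest path length through that graph. The Python sketch below illustrates only this building block, not GSLDR itself; the function names, the choice k = 2, and the toy points on a half circle are all hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (euclidean(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` to every node."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy data (assumption): 7 points on a unit half circle, a 1-D manifold in 2-D.
points = [
    (math.cos(math.radians(a)), math.sin(math.radians(a)))
    for a in range(0, 181, 30)
]
graph = knn_graph(points, k=2)

straight = euclidean(points[0], points[-1])   # cuts across the half circle
geo = geodesic_distances(graph, 0)[len(points) - 1]  # follows the curve

print(f"Euclidean: {straight:.3f}, graph geodesic: {geo:.3f}")
```

For the two end points of the half circle, the graph distance is larger than the straight-line distance because the path has to follow the curve. This is the sense in which a geodesic distance can reflect manifold structure that the Euclidean distance misses.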
Keywords/Search Tags: semi-supervised dimensionality reduction, pairwise constraints, handwriting identification, feature extraction, multi-channel Gabor wavelet