The Research Of Similarity Metric In K Nearest Neighbor Classification And Fuzzy C Means Clustering

Posted on:2016-07-09

Degree:Master

Type:Thesis

Country:China

Candidate:J J Zeng

GTID:2308330470973208

Subject:Computer technology

Abstract/Summary:

Pattern recognition emerged in the 1920s. With the appearance of computers in the 1940s and the development of artificial intelligence in the 1950s, pattern recognition came to play a very important role in daily life and in all sectors of society. As a result, many scholars from different fields have explored the theories and methods of pattern recognition, and it rapidly became a discipline in its own right in the early 1960s.

There are two important themes in pattern recognition research: classification and clustering. Both have been applied widely in many fields. For classification and clustering algorithms, constructing the distance measure or similarity measure is a fundamental problem, so choosing an appropriate distance or similarity measure is a key step toward good classification and clustering performance.

In this thesis, following the basic idea of the locality preserving projections (LPP) algorithm, we first construct a new similarity measure and then propose new classification and clustering algorithms that reflect the internal structure of the data. First, we give a brief overview of classification and clustering. Second, we list some similarity measures that are commonly used in classification and clustering algorithms. Third, we introduce the K Nearest Neighbor (KNN) algorithm, the Fuzzy C Means (FCM) algorithm and the LPP algorithm in detail; LPP has attracted much attention recently. Last, following the basic idea of LPP, we improve the KNN and FCM algorithms.

KNN and FCM are commonly based on Euclidean distance, but Euclidean distance treats all features equally. Mahalanobis distance takes the distribution of the data into account and is not affected by the units in which the features are measured, but it exaggerates the effect of variables with small variation. Both Euclidean distance and Mahalanobis distance ignore the local intrinsic geometric structure of the data. To address this problem, following the basic idea of the LPP algorithm, we first introduce the locality preserving scatter matrix and the locality preserving within-class scatter matrix in detail, then use these scatter matrices to propose novel distance metrics, and finally develop modified versions of the classification and clustering algorithms with improved accuracy. We carry out experiments on real data, fitting data, face data and handwritten digit data. The experimental results, including those based on cross validation, show that the proposed methods are effective and feasible. Compared with classification and clustering algorithms based on Euclidean distance and Mahalanobis distance, the proposed algorithms achieve better classification and clustering accuracy.
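The contrast the abstract draws between Euclidean and Mahalanobis distance can be illustrated with a small sketch. It is not the thesis's method: the toy data, the function names (`quad_dist`, `covariance_2d`, `inverse_2x2`, `knn_predict`) and the choice of k = 3 are all assumptions made for illustration. The sketch treats both distances as instances of one quadratic form, sqrt((x - y)^T M (x - y)), where M is the identity matrix for Euclidean distance and the inverse covariance matrix for Mahalanobis distance. A metric built from a locality preserving scatter matrix would presumably supply a different M in the same slot, but the abstract does not give that construction, so it is not attempted here.

```python
import math
from collections import Counter


def quad_dist(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a square matrix M (list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return math.sqrt(sum(d[i] * Md[i] for i in range(len(d))))


def covariance_2d(points):
    """Sample covariance matrix (n - 1 denominator) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the closed-form adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def knn_predict(query, points, labels, M, k=3):
    """Majority vote among the k training points nearest to query under M."""
    ranked = sorted(range(len(points)),
                    key=lambda i: quad_dist(query, points[i], M))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: feature 0 has a large scale and carries no class
# information; feature 1 has a small scale and separates the two classes.
points = [(1000, 1.0), (3000, 1.2), (5000, 0.8),
          (1100, 3.0), (2900, 3.2), (5100, 2.8)]
labels = ["a", "a", "a", "b", "b", "b"]
query = (1050, 1.1)

identity = [[1.0, 0.0], [0.0, 1.0]]              # M for Euclidean distance
inv_cov = inverse_2x2(covariance_2d(points))     # M for Mahalanobis distance

euclidean_label = knn_predict(query, points, labels, identity)
mahalanobis_label = knn_predict(query, points, labels, inv_cov)
print("Euclidean KNN:", euclidean_label)
print("Mahalanobis KNN:", mahalanobis_label)
```

On this toy data the two metrics disagree: under Euclidean distance the large-scale feature dominates the neighbor ranking, while the inverse covariance rescales both features before the vote. Passing the metric in as a matrix keeps the classifier itself unchanged when the distance is swapped.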

Keywords/Search Tags:

classification, clustering, Locality Preserving Projections, Mahalanobis distance

Related items

1	Research On Some Problems Of Locality Preserving Projections
2	The Research Of Distance Metric And Model Selection In K-Means Clustering And L2-SVM Classification
3	Algorithms Research On Face Recognition Based On Locality Preserving Projections
4	Face Recognition Based On Locality Preserving Projections
5	Research And Application Of Locality Preserving Projections Algorithm Based On Maximum Marginal Criterion
6	Parameter-less Supervised Kernel Locality Preserving Projection And Face Recognition
7	Researches On Gait Recognition Algorithms And System
8	Face Recognition Based On Parameter-less Two-dimensional Discriminant Locality Preserving Projections
9	Research And Application Of Robust Locality Preserving Projections
10	T-mixture Models And Extended Locality Preserving Projections For Clustering And Dimensionality Reduction