
The Study Of Fuzzy Fisher Based Clustering And Feature Dimension Reduction

Posted on: 2010-05-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Q Cao
GTID: 1118360302487750
Subject: Light Industry Information Technology and Engineering
Abstract/Summary:
Clustering analysis and feature dimension reduction are two important research topics in pattern recognition. As an important unsupervised pattern recognition tool, clustering analysis has been used in diverse fields such as data mining, biology, computer vision, and document analysis. It aims to partition a dataset so that samples within a cluster are as similar as possible and samples in different clusters are as dissimilar as possible. Feature dimension reduction, which includes feature extraction and feature selection, also plays a very important role in pattern recognition. It helps to remove noisy features and to reduce the dimensionality of the original dataset.

This dissertation addresses several issues in fuzzy clustering and feature dimension reduction, including semi-fuzzy clustering based on the fuzzy Fisher criterion, unsupervised feature extraction, and feature selection for imbalanced datasets. Its main contributions are as follows:

1. Fisher linear discriminant (FLD) is extended to fuzzy FLD, and a novel fuzzy clustering algorithm based on it, the fuzzy Fisher criterion based semi-fuzzy clustering algorithm (FBSC), is proposed. The algorithm incorporates the discriminant vector into its update equations, so the resulting update equations do not take the commonly used FCM-like forms. Strictly speaking, the proposed algorithm is rooted in both the fuzzy within-class scatter matrix and the fuzzy between-class scatter matrix, whereas most fuzzy clustering algorithms, such as FCM, are rooted only in the fuzzy within-class scatter matrix. Because it uses the fuzzy Fisher criterion as its objective function, FBSC can be viewed as a novel fuzzy clustering algorithm. This study also opens up a new application of FLD.

2. A method is presented that extends optimal discriminant plane feature extraction to the unsupervised setting. The basic idea is to optimize the defined fuzzy Fisher criterion function to obtain the first optimal discriminant vector and the fuzzy scatter matrices without class labels. Based on these, the second discriminant vector is obtained by maximizing the fuzzy Fisher criterion function under the orthogonality constraint, the conjugate orthogonality constraint, or both constraints together. The two discriminant vectors then make up, respectively, an unsupervised optimal discriminant plane (UODP), an unsupervised uncorrelated optimal discriminant plane (UUODP), or an improved unsupervised uncorrelated optimal discriminant plane (IUUODP).

3. An extension of the optimal set of discriminant vectors to the unsupervised setting is presented. The basic idea is to extend Fisher linear discriminant to a novel semi-fuzzy clustering algorithm through the defined fuzzy Fisher criterion function. With the proposed algorithm, an optimal discriminant vector and the fuzzy scatter matrices can be computed, and an unsupervised optimal set of discriminant vectors can then be obtained. The experimental results demonstrate that, although this method cannot surpass the traditional supervised optimal set of discriminant vectors, its performance is comparable to that of principal component analysis (PCA), which is itself an unsupervised feature extraction method.

4. A novel classifier-independent feature selection algorithm based on the posterior probability is proposed for imbalanced datasets. First, an imbalance factor is introduced and computed by Parzen-window estimation. The midpoint of each Tomek link is chosen as an initial point, and the algorithm then iterates from it to find boundary points at which the posterior probabilities of the classes are equal. Through projection computations on the normal vectors at these points, a weight is obtained for each feature, which indicates the importance of that feature. The experimental results demonstrate that the proposed algorithm not only reduces the computational cost but also overcomes a shortcoming of conventional feature selection algorithms, namely that the minority class may be ignored.
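The abstract does not state the fuzzy Fisher criterion explicitly, so the following is only a minimal sketch under assumed definitions. It assumes that the fuzzy within-class and between-class scatter matrices are the usual FLD scatter matrices weighted by memberships raised to a fuzzifier m, that cluster centers are FCM-style weighted means, and that the criterion is the Rayleigh quotient of the two matrices. The function names, the parameter m, and the eigenvector-based solution are illustrative assumptions and are not taken from the dissertation.

```python
import numpy as np


def fuzzy_scatter(X, U, m=2.0):
    """Assumed fuzzy scatter matrices.

    X: (n, d) samples; U: (c, n) memberships; m: fuzzifier (assumption).
    Returns (S_fw, S_fb, centers).
    """
    W = U ** m                                        # (c, n) fuzzy weights
    centers = (W @ X) / W.sum(axis=1, keepdims=True)  # weighted cluster means
    mean = X.mean(axis=0)                             # overall sample mean
    d = X.shape[1]
    S_fw = np.zeros((d, d))
    S_fb = np.zeros((d, d))
    for i in range(U.shape[0]):
        diff = X - centers[i]                         # (n, d)
        # Within-class part: membership-weighted outer products.
        S_fw += (W[i][:, None] * diff).T @ diff
        # Between-class part: center-to-mean outer product, weighted
        # by the total membership mass of cluster i.
        b = (centers[i] - mean)[:, None]
        S_fb += W[i].sum() * (b @ b.T)
    return S_fw, S_fb, centers


def fuzzy_fisher(w, S_fw, S_fb):
    """Assumed criterion: J(w) = (w^T S_fb w) / (w^T S_fw w)."""
    return float(w @ S_fb @ w) / float(w @ S_fw @ w)


def first_discriminant_vector(S_fw, S_fb):
    """Maximize J for fixed scatter matrices.

    Uses the leading eigenvector of S_fw^{-1} S_fb; assumes S_fw is
    nonsingular.
    """
    vals, vecs = np.linalg.eig(np.linalg.solve(S_fw, S_fb))
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)
```

Given a membership matrix U whose columns sum to one, calling fuzzy_scatter(X, U) and then first_discriminant_vector(S_fw, S_fb) yields a direction that maximizes this assumed criterion for the fixed memberships. An alternating scheme such as the one the abstract describes would additionally update the memberships and centers; that step is not shown here because the abstract does not give its update equations.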
Keywords/Search Tags: Fuzzy clustering, Feature dimension reduction, Unsupervised pattern, Fisher criterion, Fuzzy Fisher criterion, Feature extraction, Optimal discriminant vector, Optimal discriminant plane, Optimal set of discriminant vectors, Imbalanced data