
Research On Feature Extraction And Fusion Using Subspace Learning

Posted on: 2017-12-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Wang
Full Text: PDF
GTID: 1318330512471771
Subject: Control Science and Engineering
Abstract/Summary:
In real applications of pattern recognition such as face recognition, handwriting recognition and image clustering, the dimensionality of the data is usually very high. Moreover, each sample contains much redundant information and noise, which degrades recognition performance. Feature extraction, a common scheme for obtaining a low-dimensional representation, has therefore become a hot topic and attracted wide attention from scholars. With the development of data acquisition technology, large amounts of data with different characteristics are being collected, and feature fusion, which aims to extract features from data with different characteristics, has become another hot topic. Among all methods for feature extraction and feature fusion, subspace learning methods form one of the most important categories. Our research therefore focuses on subspace learning for feature extraction and feature fusion, and we propose four subspace learning methods for classification and clustering. The main work is as follows:

(1) We propose a method named semi-supervised linear discriminant analysis (SLDA) for feature extraction. As is well known, linear discriminant analysis (LDA) requires the labels of all training data; however, in real applications there are many unlabeled data, so LDA is hard to apply in such cases. To solve this problem, SLDA can be trained with a limited number of labeled samples together with a large quantity of unlabeled ones. In SLDA, we introduce an objective function that jointly computes the projection vectors and the label information, while minimizing the distance between the computed label indicator matrix and the true indicator matrix. Furthermore, non-negativity and orthogonality constraints on the label indicator matrix are used to increase the accuracy of the computed label indicator matrix. To solve the objective function, we adopt an alternating optimization strategy that iterates between computing the projection vectors and the labels of the unlabeled samples.

(2) We propose a method named canonical principal angles correlation analysis (CPACA). Canonical correlation analysis (CCA) needs the pairing information between the data of two views and cannot utilize unpaired data; moreover, it does not consider the nonlinear structure of the data. To avoid these two problems, CPACA uses multiple principal angles to compute the correlation between two views, which does not require pairing information between the views. To preserve the nonlinear structure of the data, we introduce manifold regularization to constrain the distribution of the data. The projection vectors are thus obtained by maximizing the correlation between the two subspaces while maintaining the nonlinear structure of the data.

(3) We propose a method for multi-view data named unsupervised discriminant canonical correlation analysis based on spectral clustering (UDCCASC). CCA considers only the correlation between paired data and ignores the correlation among data of the same class, so it does not fully exploit the discriminant information of two-view data. To solve this problem, UDCCASC applies normalized spectral clustering to the projected multi-view data, which provides cluster memberships for the next iteration; these memberships are then used to exploit the discriminant information of the multi-view data. In this algorithm, we roughly divide the within-class correlations into three categories: (a) the within-class correlation between paired data; (b) the within-class correlation across views; (c) the within-class correlation within views. To balance these correlations, we introduce a strategy to compute the weight of each category.

(4) We propose a method named canonical correlation analysis based on the L1 norm (CCA-L1). By a simple algebraic derivation, the objective function of CCA is equivalent to minimizing the L2 distance between projected paired data; thus CCA is intrinsically based on the L2 norm. For CCA, paired data with small distances should be the more important ones; however, the L2 norm gives large weight to paired data with large distances and small weight to paired data with small distances, which can degrade the performance of CCA. To alleviate this problem, we propose CCA-L1, canonical correlation analysis based on L1-norm minimization, for feature fusion, together with an algorithm to optimize its objective function. Furthermore, we give three extensions of CCA-L1.
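Two facts used throughout the above can be checked numerically: the canonical correlations of two views are the singular values of Q_x^T Q_y (i.e., the cosines of the principal angles between the two column spaces, the quantity CPACA builds on), and maximizing the correlation of unit-norm projections is equivalent to minimizing their squared L2 distance, the observation behind CCA-L1. The following NumPy sketch illustrates both on synthetic data; all variable names and the data-generation setup are hypothetical, not taken from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic views sharing a latent signal z (hypothetical data).
n = 500
z = rng.normal(size=(n, 1))
X = np.hstack([z, rng.normal(size=(n, 2))]) @ rng.normal(size=(3, 3))
Y = np.hstack([z, rng.normal(size=(n, 2))]) @ rng.normal(size=(3, 3))

def cca_first_pair(X, Y):
    """First pair of canonical variates via whitening + SVD."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    # Orthonormal bases for the (centered) column spaces of each view.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations,
    # equivalently the cosines of the principal angles between views.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)
    u = Qx @ U[:, 0]   # first canonical variate of view X (unit norm)
    v = Qy @ Vt[0, :]  # first canonical variate of view Y (unit norm)
    return u, v, s[0]

u, v, rho = cca_first_pair(X, Y)
# Maximizing the correlation of unit-norm projections equals
# minimizing their squared L2 distance: ||u - v||^2 = 2 - 2*rho.
assert np.isclose(u @ v, rho, atol=1e-8)
assert np.isclose(np.sum((u - v) ** 2), 2 - 2 * rho, atol=1e-6)
```

The identity checked by the last assertion is exactly why CCA is "based on the L2 norm in nature": replacing that implicit L2 distance with an L1 distance is the starting point of CCA-L1.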
Keywords/Search Tags: canonical correlation analysis, feature extraction, subspace learning, linear discriminant analysis, semi-supervised learning