
Globality And Locality Incorporation In Distance Metric Learning

Posted on: 2015-01-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Wang
GTID: 1268330428984398
Subject: Pattern Recognition and Intelligent Systems
Abstract
Metric learning is important and fundamental in machine learning. A distance metric measures the dissimilarity between points and significantly influences the performance of many machine learning algorithms, such as k-nearest neighbor classification, support vector machines, radial basis function networks, and k-means clustering. Owing to the efficiency and scalability of linear metric learning, most effort has been devoted to learning a Mahalanobis distance, d_M(x, y) = sqrt((x − y)^T M (x − y)) with M positive semidefinite, from labeled training data. To improve classification performance and adapt to multimodal data distributions, incorporating geometric information (i.e., locality) with label information (i.e., globality) is particularly valuable and challenging. Therefore, the specific concerns of this thesis are: 1) incorporating globality and locality into the Mahalanobis distance without optimizing balancing weight(s); 2) reducing the computational complexity. Research results were obtained in the following three stages.

The First Stage: Discriminating Classes Collapsing for Globality and Locality Preserving Projections. As a widely used metric learning method, Metric Learning by Collapsing Classes (MCML) [5] aims to find a distance metric that collapses all points in the same class while maintaining separation between different classes. This part combines the ideas behind locality preservation, discriminating power, and MCML in a unified method. The proposed algorithm is convex and incorporates globality and locality information without balancing weight(s). To further decrease the running time, some computationally intensive steps of the proposed method are mapped to a GPU architecture. Experimental results demonstrate the effectiveness of the proposed method.

The Second Stage: Dependence Maximization based Metric Learning. The method proposed in the first stage has a complex objective function whose derivative is difficult to calculate. This part therefore proposes a general Mahalanobis distance learning framework, referred to as "Dependence Maximization based Metric Learning" (DMML), in a statistical setting. The main contributions of this part include:
· DMML effectively incorporates two sources of information (i.e., globality and locality) into the Mahalanobis distance without optimizing balancing weight(s).
· Distinguished from classical dependence measuring criteria (e.g., mutual information and Pearson's χ² test), DMML uses criteria computed in reproducing kernel Hilbert spaces (RKHSs) to avoid estimating or assuming the data distributions. Many existing kernel-based criteria can be incorporated into DMML to tackle the independence measurement problem.
· Under the DMML framework, two methods are proposed by employing the Hilbert-Schmidt Independence Criterion (HSIC) [8] and generalized distance covariance [28], respectively; a sketch of the empirical HSIC estimator follows this list. Both are formulated as convex programs and can be efficiently optimized by a first-order gradient procedure.
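To make the HSIC criterion concrete, the following is a minimal sketch of the standard biased empirical HSIC estimator, HSIC(K, L) = tr(KHLH)/(n − 1)². The function names, the RBF bandwidth `gamma`, and the delta kernel on labels are illustrative assumptions, not the thesis's implementation; DMML's optimization of the Mahalanobis matrix on top of such a criterion is omitted here.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2,
    where H = I - (1/n) 11^T is the centering matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Example: dependence between data and class labels, using a
# delta kernel on labels (L[i, j] = 1 iff y_i == y_j).
X = np.random.randn(50, 5)
y = np.random.randint(0, 3, size=50)
print(hsic(rbf_kernel(X, gamma=0.5), (y[:, None] == y[None, :]).astype(float)))
```

Computing the criterion entirely from kernel matrices is what lets such methods avoid estimating the data distributions, as noted in the second bullet above.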
The Third Stage: Efficient and Scalable Information Geometry Metric Learning. Although the methods proposed in the first two stages are convex problems, they are optimized by a gradient descent method. In contrast, and unlike most existing metric learning methods, Information Geometry Metric Learning (IGML) [24] admits a closed-form solution. This part proposes two novel distance metric learning algorithms that alleviate IGML's limitations. (1) The proposed method EIGML reduces the computational complexity of IGML from O(d³ + nd²) to O(nd). (2) The objective of IGML becomes infinite for singular matrices, and the geometric information of the data is lost in IGML; the proposed method SIGML preserves both locality and globality. We emphasize that both methods admit closed-form solutions, leading to efficient optimization.

Summary: The proposed method SIGML of the third stage subsumes the incorporation of globality and locality developed in the first two stages. SIGML finds a closed-form solution and avoids the parameter tuning required by iterative solutions. As a globality metric learning method, EIGML greatly reduces the computational complexity and can be applied to large-scale, high-dimensional data.
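As a usage illustration of any learned metric above (classification being a headline application), the sketch below applies a positive definite Mahalanobis matrix M to k-nearest-neighbor classification. Factoring M = CCᵀ reduces d_M to the Euclidean distance between transformed points x ↦ Cᵀx. The function and variable names are illustrative, and M is assumed to come from one of the proposed methods.

```python
import numpy as np
from scipy.spatial.distance import cdist

def mahalanobis_knn_predict(M, X_train, y_train, X_test, k=3):
    """k-NN classification under d_M(x, y) = sqrt((x - y)^T M (x - y)).

    A Cholesky factorization M = C C^T (assumes M positive definite)
    turns the Mahalanobis distance into the Euclidean distance between
    the transformed points X @ C.
    """
    C = np.linalg.cholesky(M)
    D = cdist(X_test @ C, X_train @ C)   # Euclidean in transformed space
    idx = np.argsort(D, axis=1)[:, :k]   # k nearest training points per query
    votes = y_train[idx]                 # neighbor labels, shape (n_test, k)
    # Majority vote; assumes integer class labels >= 0
    return np.array([np.bincount(v).argmax() for v in votes])
```

The same factorization view explains why linear metric learning doubles as dimensionality reduction: a low-rank M corresponds to a linear projection of the data.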
Keywords: metric learning, Mahalanobis distance, convex optimization, closed-form solution, classification, dimensionality reduction