
Robust Hierarchical Feature Reduction Based On The Class Relation

Posted on: 2021-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: X X Liu
Full Text: PDF
GTID: 2428330629980601
Subject: Computer application technology
Abstract/Summary:
In the era of big data, massive samples, large-scale categories, and high-dimensional features bring abundant information to machine learning. In addition, categories are usually organized in complex structures such as hierarchies, and inevitable noisy data reduces the quality and availability of the data. These characteristics pose severe challenges to traditional feature reduction methods in machine learning: (1) the large number of features leads to the curse of dimensionality; (2) the complex class hierarchy breaks the assumption, made by traditional dimensionality reduction methods, that categories are mutually independent; and (3) low-quality data violates the basic assumption of high data quality on which traditional methods rely. As a result, traditional methods are ineffective or inapplicable for large-scale classification tasks. In this thesis, we mine the class relations in the hierarchical structure of categories and study robust hierarchical feature dimensionality reduction methods for complex large-scale classification tasks with a hierarchical class structure and low-quality data. Three methods are studied:

1) Hierarchical feature extraction based on class dispersion. Traditional feature extraction methods ignore the complex hierarchical relationships between categories, and some existing methods that perform dimensionality reduction for each level of the category hierarchy are prone to significant classification errors. The proposed method decomposes the hierarchical classification task into subtasks with nodes as units, then defines inter-class and intra-class dispersion matrices for the tasks at different granularities. Finally, a hierarchical feature extraction method based on class dispersion is proposed, built on discriminant analysis.

2) Robust hierarchical feature selection based on a category similarity constraint. Traditional feature selection methods ignore the relationships between categories, and general feature dimensionality reduction methods are not robust enough. Because a fine-grained child classification task is contained in its coarse-grained parent task, the two should be similar to some degree, and a similarity constraint between categories is defined on this basis. A capped least squares loss function is then adopted to filter out outliers and noisy data. Finally, a robust hierarchical feature selection method with similarity relation constraints is put forward.

3) Robust hierarchical feature selection based on a class center generalization constraint. Traditional feature selection methods ignore the relationships between the numerous categories, and general feature dimensionality reduction methods are not robust enough. A coarse-grained classification task includes all of its subordinate fine-grained classification tasks, and the coarse-grained category is a generalization of its subcategories, so it should lie close to the center of its subcategories. A center generalization constraint between categories is defined according to this idea, and a robust capped hinge loss function is used to filter out data noise. Finally, a robust hierarchical feature selection method based on the class center generalization constraint is proposed.
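
The abstract gives no formulas, so the following Python sketch only illustrates two generic building blocks it names: a capped squared loss, which limits how much any single outlier can contribute, and the inter-class and intra-class dispersion (scatter) matrices of classical discriminant analysis, computed for the samples at one node of the hierarchy. The function names, the cap value, and the toy data are assumptions made for illustration; the thesis's own definitions, including its hierarchical constraints, may differ.

```python
import numpy as np


def capped_squared_loss(residuals, cap):
    """Per-sample squared residual norm, truncated at `cap`.

    `residuals` must be 2-D with shape (n_samples, n_outputs).
    Samples whose squared error exceeds `cap` contribute only `cap`,
    so extreme outliers cannot dominate the objective.
    """
    residuals = np.asarray(residuals, dtype=float)
    squared = np.sum(residuals ** 2, axis=1)
    return np.minimum(squared, cap)


def scatter_matrices(X, y):
    """Inter-class (S_b) and intra-class (S_w) scatter for one node's samples.

    This is the classical discriminant-analysis definition, used here as
    an assumed stand-in for the thesis's class dispersion matrices.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    S_b = np.zeros((n_features, n_features))
    S_w = np.zeros((n_features, n_features))
    for label in np.unique(y):
        X_c = X[y == label]
        class_mean = X_c.mean(axis=0)
        diff = (class_mean - overall_mean)[:, None]  # column vector
        S_b += len(X_c) * (diff @ diff.T)
        centered = X_c - class_mean
        S_w += centered.T @ centered
    return S_b, S_w


# Toy data (hypothetical): the second sample is an outlier for the loss.
toy_residuals = np.array([[1.0, 0.0], [3.0, 4.0], [0.0, 0.5]])
toy_losses = capped_squared_loss(toy_residuals, cap=4.0)

# Two child classes under one parent node, separated along the first feature.
toy_X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])
toy_y = np.array([0, 0, 1, 1])
toy_S_b, toy_S_w = scatter_matrices(toy_X, toy_y)
```

In a hierarchical setting, one would presumably call `scatter_matrices` once per internal node, using only the samples and child labels that belong to that node; how the per-node results are combined is not specified in the abstract.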
Keywords/Search Tags: Hierarchical classification, curse of dimensionality, feature extraction, feature selection, class relationship