Research On Image Classification For Long-Tailed Data

Posted on: 2024-11-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Li
Full Text: PDF
GTID: 1528307340977439
Subject: Software engineering
Abstract/Summary:
Image classification is a fundamental research direction in computer vision, and it typically relies on large datasets with balanced class distributions. Through representation learning, deep neural networks can extract features from images and achieve accurate classification. However, real-world data often exhibits a long-tailed distribution, where some categories (majority classes) have significantly more samples than others (minority classes). If deep neural networks are trained directly on such datasets, they tend to learn more effectively from majority class samples, resulting in weaker recognition of minority classes. To address this issue, existing research primarily employs optimized sampling strategies to balance the data distribution, or adjusts the weight allocation during training so that the model pays more attention to minority classes.

However, current research on long-tailed image classification still faces several challenges. Firstly, deep neural networks tend to map minority class samples into sparse feature clusters, so their boundary points can easily cross the decision boundary. Secondly, because the decision boundary lies closer to minority clusters, the model's tolerance space may be reduced, which affects its generalization ability. Thirdly, minority samples are not only prone to being misclassified as majority classes but are also insufficiently distinguishable from one another, further limiting the overall performance of the model. Fourthly, in practical applications, as the number of minority class samples increases, the decision regions delineated from limited training data may fail to adequately capture these features. To address these challenges, this thesis explores the long-tailed image classification problem in depth from the perspectives of deep feature learning and weight optimization. The main content of this thesis is as follows:

Firstly, a feature cluster compression algorithm is proposed. It establishes a linear compression relationship at the feature level and forces the model to map minority samples into denser feature clusters during training, thereby pulling the boundary points back inside the decision boundary and improving performance. Furthermore, this algorithm is easy to implement, does not conflict with other model components, and can be readily combined with existing long-tail methods to further enhance their performance. This thesis applies the algorithm to over 30 existing methods, and the experimental results on four public benchmark datasets demonstrate its effectiveness and superiority.

Secondly, an optimal feature distribution guidance algorithm is proposed, which consists of two stages: (1) a feature cluster center compression algorithm is first proposed by improving the aforementioned feature cluster compression; it compresses features by collapsing them inward, expands the tolerance space, and thereby achieves an optimal position mapping of features; (2) based on the knowledge distillation framework, the optimal distribution information is used to guide the training of the student model. This additional information helps the model map features more quickly and effectively when minority samples are scarce, further improving the tolerance space. On the long-tailed benchmark dataset, this algorithm exhibits good generalization ability.

Thirdly, an adaptive cost-sensitive learning algorithm is proposed. It introduces a cost-sensitive matrix during training, which sets individual penalty weights for misclassifications between different classes and thereby enhances the model's ability to distinguish difficult-to-classify classes. In addition, this thesis designs an automatic optimization algorithm for the cost-sensitive matrix, which updates the matrix parameters based on the data distribution characteristics and the training status, so that the penalty values are set more accurately and effectively. Extensive experimental results show that this algorithm not only improves the performance of minority classes but also enhances the recognition of difficult-to-classify classes.

Finally, a biased shortest distance criterion algorithm is proposed. In practical applications, it uses the shortest distance between a feature and the centers of the feature clusters to reclassify minority classes. This process essentially replaces the decision boundary with the perpendicular bisector between the centers and allocates an equal decision region to each class. In addition, this thesis proposes that, when the shortest distance is calculated, the center is shifted so that its perpendicular bisector moves closer to the majority classes, thereby providing a larger decision region for minority classes to better cover minority features. Extensive experimental results show that, without the need for additional training, this algorithm can significantly improve performance.
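
The biased shortest distance criterion can be illustrated with a minimal sketch. This is not the thesis's implementation. The abstract does not state which center is shifted, in which direction, or by how much, so the rule below (moving each minority center a fraction `shift` toward the mean of the majority centers) is an assumption made only for illustration. The function names, the toy two-dimensional features, and the value `shift = 0.25` are likewise hypothetical.

```python
import math


def class_centers(features, labels):
    """Return the mean feature vector (cluster center) of each class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def shift_minority_centers(centers, minority, shift):
    """Move each minority center a fraction `shift` toward the mean majority center.

    ASSUMPTION: this shifting rule is hypothetical; the abstract only says the
    center is shifted so that the perpendicular bisector moves toward the
    majority classes.
    """
    majority = [c for y, c in centers.items() if y not in minority]
    dim = len(majority[0])
    anchor = [sum(c[i] for c in majority) / len(majority) for i in range(dim)]
    shifted = {}
    for y, c in centers.items():
        if y in minority:
            shifted[y] = [ci + shift * (ai - ci) for ci, ai in zip(c, anchor)]
        else:
            shifted[y] = list(c)
    return shifted


def nearest_center(x, centers):
    """Assign x to the class whose center is closest in Euclidean distance."""
    return min(centers, key=lambda y: math.dist(x, centers[y]))


# Toy features: class 0 is the majority (5 samples), class 1 the minority (2 samples).
features = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5),
            (3.5, 0.0), (4.5, 0.0)]
labels = [0, 0, 0, 0, 0, 1, 1]

centers = class_centers(features, labels)            # centers at (0, 0) and (4, 0)
shifted = shift_minority_centers(centers, {1}, 0.25)  # minority center moves to (3, 0)

query = (1.8, 0.0)
plain_prediction = nearest_center(query, centers)    # unbiased bisector at x = 2.0
biased_prediction = nearest_center(query, shifted)   # biased bisector at x = 1.5
```

In this toy setting the unbiased criterion assigns the query to the majority class, while the biased criterion assigns it to the minority class, because the boundary has moved from x = 2.0 to x = 1.5. No retraining is involved: only the centers used at inference time change, which matches the abstract's remark that the algorithm needs no additional training.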
Keywords/Search Tags: Image classification, Long-tailed visual recognition, Deep clustering, Cost-sensitive learning, Knowledge distillation