
Research On Dataset Distillation Algorithm For Face Image Classification

Posted on: 2024-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: X G Zhuang
Full Text: PDF
GTID: 2568307079455494
Subject: Information and Communication Engineering
Abstract/Summary:
Advances in deep learning have made face images increasingly promising for applications in many fields. These applications depend on numerous face datasets, whose sample counts, resolutions, and source diversity keep growing, supplying deep learning models with a wealth of knowledge. However, such large-scale data requires many processing steps and consumes substantial storage, bandwidth, and computational resources, which also makes neural network training more difficult and resource-intensive. Sharing datasets is a cost-effective way to improve utilization, but it raises issues of privacy, data quality, copyright, ethics, organization, and management that hinder its adoption. In light of these challenges, researchers have proposed a technique known as dataset distillation, which compresses the knowledge of an original dataset into a small synthetic dataset. This approach can speed up model training, reduce dataset storage and transmission bandwidth costs, provide some visual-privacy protection, and sidestep potential copyright and ethical issues.

Existing dataset distillation methods are designed for generic datasets; applying them to face images raises additional problems of data redundancy, differing feature focus, and varying complexity. To construct more compact and efficient face image datasets, this thesis studies dataset distillation on face images and proposes three improvements tailored to the characteristics of face images and the shortcomings of existing distillation techniques.

First, to reduce image-information redundancy in distilled datasets, this thesis proposes a multi-generator dataset distillation framework motivated by the similarities and differences among facial features. The framework consists of multiple sets of latent-space parameters and generators that extract the differing and the shared information respectively, replacing the synthetic-set representation used in the original dataset distillation method. Experiments demonstrate that this method greatly reduces the total number of parameters required to represent the dataset while maintaining the performance of the generated set.

Second, to address the locality of facial features and the differences in their importance, this thesis proposes a teacher-transferred attention module. A pre-trained teacher model transfers its trained attention module to the student network, and the features of the generated images are correspondingly strengthened or weakened to guide the training of the generated set. Experimental results show that generated sets trained with this module achieve better performance.

Third, to address the multiplicity of face attributes, this thesis proposes providing soft labels with a teacher model. Unlike previously proposed learnable soft labels, this method introduces no additional optimization difficulty while improving the performance of the generated set. Experiments compare and validate the choice of soft-label source and the resulting performance gains.

Overall, this study proposes methods that simultaneously reduce the parameter count of the distilled face dataset and enhance its quality, and the experimental results demonstrate a measurable improvement in the quality of the distilled dataset.
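The abstract does not give the exact formulation of the teacher-provided soft labels. A common, fixed (non-learnable) construction, as in standard knowledge distillation, is a temperature-softened softmax over the teacher's logits; the sketch below illustrates that idea in plain Python. The function name and the temperature value are assumptions for illustration, not the thesis's actual implementation.

```python
import math

def teacher_soft_labels(logits, temperature=4.0):
    """Turn a teacher model's raw logits into fixed soft labels.

    Divides logits by a temperature before the softmax: higher
    temperatures flatten the distribution, exposing the teacher's
    relative preferences among non-target classes. The temperature
    value here is a hypothetical choice, not one from the thesis.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical teacher logits for a 3-class head.
probs = teacher_soft_labels([5.0, 2.0, -1.0])
```

Because the labels are computed once from a frozen teacher, they add no parameters to the distillation objective, which matches the abstract's claim that this avoids the extra optimization difficulty of learnable soft labels.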
Keywords/Search Tags:Dataset Distillation, Face Dataset, Image Generation, Attention Network, Soft Labels