Balancing Knowledge Transfer And Expression Feature Fusion Research Based On Metric Learning

Posted on: 2024-07-21  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Yu  Full Text: PDF
GTID: 2568307139496294  Subject: Engineering
Abstract/Summary:
Deep metric learning is a long-standing research focus in computer vision. It aims to train a deep neural network embedding model that maps images from a high-dimensional space into a low-dimensional embedding space, where each image is represented as a feature vector. In this embedding space, semantically similar images lie close together, while semantically different images lie far apart. Deep metric learning is widely applied in image retrieval, face recognition, and self-supervised representation learning. This thesis focuses on deep metric learning and explores two topics: knowledge transfer between embedding models (embedding transfer), and the application of deep metric learning to facial expression recognition to improve recognition accuracy.

(1) Balanced knowledge transfer for embedding models. Embedding transfer passes the knowledge learned by one embedding model (the teacher) to another embedding model (the student), either to improve model performance or to reduce model size and embedding space dimensionality. This thesis proposes a novel balanced embedding transfer method, Balance Embedding Transfer (BET). Unlike previous methods, BET divides the knowledge captured by an embedding model into two categories: intra-class compact knowledge and inter-class separable knowledge. We observe that during knowledge transfer, inter-class separable knowledge is far more abundant than intra-class compact knowledge. This imbalance biases the transfer toward inter-class separable knowledge, suppresses intra-class compact knowledge, and ultimately degrades the performance of the student embedding model. We therefore design a novel balance-aware loss function for BET that controls the relative influence of intra-class compact knowledge and inter-class separable knowledge on embedding transfer. BET is conceptually simple, easy to implement, and applicable to any knowledge representation based on pairwise sample relationships. Extensive experiments on standard benchmark datasets verify the effectiveness of BET and show that it outperforms the latest methods.

(2) Facial expression feature fusion based on multi-threshold metric learning. Triplet-based deep metric learning can construct effective features for facial expression recognition. However, the performance of the traditional triplet loss function depends heavily on its threshold, and the optimal threshold differs significantly across datasets and even across categories within the same dataset, which makes it extremely difficult to determine. This thesis proposes multi-threshold deep metric learning, which not only avoids threshold validation but also significantly improves the performance of triplet loss learning. We find that in the triplet loss function, each threshold essentially determines a distinct distribution of inter-class variations. The method therefore no longer selects a single optimal threshold. Instead, it samples a set of thresholds across the threshold range so as to extract and exploit the different expression features that correspond to different thresholds. We divide the embedding layer of the deep metric network into a group of embedding slices and model the training of these slices as an end-to-end multi-threshold deep metric learning process. Each embedding slice is assigned one sampled threshold, and the triplet loss with that threshold supervises the slice during training. In the end, each embedding slice yields a distinct expression feature, and fusing these features improves facial expression recognition accuracy.
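The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation, and the abstract does not give the exact loss formulas. Everything named here is an assumption made for illustration: the function names, the weight `alpha`, the use of Euclidean pairwise distances as the "paired sample relationship", the squared-error matching between teacher and student, and the averaging of per-slice losses. The first function averages the intra-class and inter-class pair errors separately, so that the far more numerous inter-class pairs cannot dominate. The second splits an embedding into equal slices and applies a triplet loss with a different threshold to each slice.

```python
import numpy as np


def pairwise_dist(x):
    """Euclidean distance matrix between all rows of x (shape: N x N)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(-1) + 1e-12)


def balanced_transfer_loss(teacher, student, labels, alpha=0.5):
    """Hypothetical balanced relational transfer loss (illustrative only).

    Teacher and student may have different embedding dimensions, because
    only their N x N pairwise distance matrices are compared.
    alpha weights intra-class pairs; (1 - alpha) weights inter-class pairs.
    """
    err = (pairwise_dist(teacher) - pairwise_dist(student)) ** 2
    same = labels[:, None] == labels[None, :]
    intra = same & ~np.eye(len(labels), dtype=bool)  # same class, i != j
    inter = ~same                                    # different classes
    # Averaging each group separately keeps the inter-class pairs from
    # swamping the intra-class pairs by sheer count.
    intra_term = err[intra].mean() if intra.any() else 0.0
    inter_term = err[inter].mean() if inter.any() else 0.0
    return float(alpha * intra_term + (1.0 - alpha) * inter_term)


def triplet_loss(anchor, positive, negative, threshold):
    """Standard triplet loss: max(d(a, p) - d(a, n) + threshold, 0), batch mean."""
    d_ap = np.linalg.norm(anchor - positive, axis=1)
    d_an = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(d_ap - d_an + threshold, 0.0).mean()


def multi_threshold_triplet_loss(anchor, positive, negative, thresholds):
    """Split embeddings into len(thresholds) equal slices, one threshold each.

    The embedding dimension must be divisible by the number of thresholds.
    Averaging the per-slice losses is an assumption of this sketch.
    """
    k = len(thresholds)
    slices = zip(np.split(anchor, k, axis=1),
                 np.split(positive, k, axis=1),
                 np.split(negative, k, axis=1),
                 thresholds)
    return float(np.mean([triplet_loss(a, p, n, t) for a, p, n, t in slices]))
```

At inference time, the slice features would be fused, for example by concatenation, before classification. The abstract does not specify the fusion step, so it is left out of the sketch.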
Keywords/Search Tags: Deep metric learning, Deep embedding model, Embedding transfer, Facial expression recognition