
Cross-Modal Sketch Retrieval Based On Self-Supervised Learning And Knowledge Distillation

Posted on: 2024-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Chen
Full Text: PDF
GTID: 2568307136494914
Subject: Computer technology
Abstract/Summary:
In recent years, sketch-based 3D shape retrieval (SB3DR) has attracted attention in the computer vision community. The task is challenging because of the large domain gap between sketches and 3D shapes. Most existing methods address it with supervised learning, extracting discriminative features in a common feature space. However, these methods depend on the quality of training data annotations, which are costly to obtain in practical scenarios, and they are not optimized for long-tailed data. To address these problems, the thesis makes three main contributions.

First, the thesis proposes a self-supervised learning approach for SB3DR (SSL). Motivated by the idea of instance discrimination, SSL treats the multiple views of a 3D shape as positive pairs and models the relations among views in a contrastive self-supervised learning framework that requires only positive samples to extract features. In addition, SSL constructs an enhanced triplet loss to narrow the gap between the sketch and 3D shape domains. Experimental results on benchmark datasets show that the SSL approach achieves performance comparable to the state of the art (SOTA).

Second, the thesis proposes a sketch-based 3D shape retrieval approach via self-supervised long-tail optimization and cross-modal distillation (SSLTKD). To address the performance bottleneck of long-tail retrieval, it introduces a 3D feature extraction method based on self-supervised long-tail optimization and constructs a cross-modal distillation architecture. Retrieval is completed by distilling the 3D features extracted by the teacher network into the sketch-modality student network, which avoids losing high-quality prior knowledge during a lengthy cross-modal knowledge fusion process. Experimental results show that SSLTKD achieves large performance gains on long-tailed datasets and outperforms the current SOTA method on balanced datasets.

Third, a complete sketch-based 3D shape retrieval prototype system is designed and implemented, integrating the knowledge distillation and long-tail self-supervised retrieval methods in a Web browser. The system lets users retrieve 3D shapes online by sketching freehand in the front end, or offline by selecting sketches from the benchmark dataset, and it keeps records based on user identity information and retrieved results. The thesis describes the system's design concept, functional modules, and architecture, and verifies its functional completeness and superior performance through a wide range of sample cases.
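
The two loss ideas mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's formulation: the abstract does not specify how the triplet loss is "enhanced" or which distillation objective is used, so the code below assumes a plain triplet loss on L2-normalized features and a mean-squared-error feature distillation term. All function names and the margin value are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Plain triplet loss: max(0, d(a, p) - d(a, n) + margin).

    Assumption: the anchor is a sketch feature, the positive is a 3D shape
    feature of the same class, and the negative is one of a different class.
    """
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, squared_distance(a, p) - squared_distance(a, n) + margin)


def distillation_loss(student_feat, teacher_feat):
    """Assumed distillation term: MSE between normalized student (sketch)
    and teacher (3D shape) features. The teacher is treated as fixed."""
    s, t = l2_normalize(student_feat), l2_normalize(teacher_feat)
    return squared_distance(s, t) / len(s)


# Toy 2-D features, chosen only for illustration.
sketch = [1.0, 0.0]
shape_same_class = [2.0, 0.0]   # same direction as the sketch feature
shape_other_class = [0.0, 3.0]  # orthogonal to the sketch feature

well_separated = triplet_loss(sketch, shape_same_class, shape_other_class)
mismatched = triplet_loss(sketch, shape_other_class, shape_same_class)
aligned = distillation_loss(sketch, shape_same_class)
misaligned = distillation_loss(sketch, shape_other_class)
```

In this toy case the triplet loss is zero when the same-class shape is already closer to the sketch than the other-class shape by more than the margin, and the distillation term is zero when the student feature points in the same direction as the teacher feature.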
Keywords/Search Tags:Cross-domain retrieval, Sketch, 3D shape, Self-supervised learning, Knowledge distillation