Research On Key Technologies Of Fine-grained Image Recognition

Posted on: 2022-05-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C B Liu
Full Text: PDF
GTID: 1488306323462804
Subject: Information and Communication Engineering
Abstract/Summary:
Fine-grained image recognition aims to distinguish objects of different subcategories within the same category. It is valuable in many applications. For example, in e-retail, it is important to distinguish different brands and types of goods with similar packaging; in medical imaging, it is crucial to distinguish benign from malignant diseases and to tell apart different subtypes; in smart cities, it is crucial to identify and count similar-looking cars and pedestrians.

Fine-grained image recognition also faces many technical challenges. The objects to be recognized come from the same broad category and have similar shapes and textures, and the identifying information is hidden in local areas, which requires accurate localization of the discriminative objects. The discriminative information is very subtle, which requires the feature extractor to be sensitive and robust. Fine-grained recognition targets real natural scenes, which requires the model to resist interference and to be flexible and lightweight. In view of the difficulty and the application value of fine-grained image recognition, this thesis focuses on its key technologies. The main contributions are as follows:

1. Object localization

Fine localization of the target provides accurate input data for the recognition system and is the basis of this thesis. Generally speaking, target localization can be divided into two kinds: localization of key points and localization of local areas.

For key point localization, traditional methods use convolutional neural networks to extract key point features layer by layer, which ignores the global structural dependence between key points and yields poor accuracy when the points are deformed. To solve this problem, this thesis first uses Faster R-CNN to capture local dependence to localize the key points. In addition, this thesis proposes a key point localization method that combines local and spatial dependence with U-Net: U-Net captures the local texture features of key points, and a lightweight non-local module is designed to capture their global structure. Finally, applied to the task of diagnosing developmental dislocation of the hip in children, this thesis achieves good performance in localizing the key points of the deformity, and in turn achieves reliable clinical diagnosis.

For local region localization, traditional methods require a large amount of professional annotation, which limits their applicability. To solve this problem, this thesis proposes a weakly supervised localization method based on visual consistency, which can localize key areas on its own without relying on fine annotation. Applied to the task of pediatric bone age assessment, the proposed technique achieves high accuracy.

2. Feature extraction

Extracting fine-grained visual features from the input data is the key part of this thesis. First, a feature extraction method based on local self-consistency is proposed. The feature extractor is optimized according to the consistency of attention and recognition, and the global features and multiple discriminative regional features are then concatenated to obtain fine and comprehensive visual features. Second, the traditional attention mechanism extracts the features of each local region in isolation and treats local feature learning as a classification task, which ignores the constraint of the global feature on the local features. To solve this problem, this thesis designs a global-local correlation constraint method, which uses knowledge distillation to transfer global knowledge to the local regions and thereby introduces a global knowledge constraint into local feature extraction. Third, this thesis further explores data augmentation for fine-grained recognition. Traditional data augmentation methods impose noise on the whole image, whereas fine-grained image recognition pays more attention to learning local region features. To address this mismatch, this thesis designs a data augmentation method based on local area disturbance, which erases the information in some local areas of the image and effectively improves the feature extraction ability of the network.

3. Feature amelioration

Ameliorating the extracted features to improve recognition is the further advance of this thesis. The features produced by an extractor are often redundant and noisy, which degrades the performance of the fine-grained classifier. To solve this problem, this thesis designs a lightweight soft attention method that strengthens the discriminative features and suppresses the redundant ones. To keep the attention diverse, this thesis designs a hidden attribute for each attention. Comparative experiments also show that the proposed method incurs only a small computational cost, which ensures its practical application value.

Finally, this thesis verifies the proposed methods on the research task of fine-grained recognition as well as on the practical application of pediatric orthopedic medical image analysis. This thesis conducts a systematic and targeted study of the key technologies of fine-grained image recognition and achieves solid results in both scientific research and practical application.
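
The local area disturbance augmentation described under feature extraction can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how regions are selected or what replaces the erased pixels. The function name `erase_local_regions`, its parameters, the square patch shape, the uniform random placement, and the constant fill value are all assumptions made for illustration. The image is represented as a plain list of rows so the example needs only the Python standard library.

```python
import random


def erase_local_regions(image, num_regions=2, region_size=2, fill=0, rng=None):
    """Return a copy of `image` with random square patches overwritten by `fill`.

    `image` is a list of equal-length rows of pixel values. Patch count, patch
    size, and uniform placement are illustrative assumptions, not details
    taken from the thesis.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    # Clamp the patch so it always fits inside the image.
    size = min(region_size, height, width)
    # Copy row by row so the caller's image is left untouched.
    out = [list(row) for row in image]
    for _ in range(num_regions):
        # randint is inclusive on both ends, so the patch stays in bounds.
        top = rng.randint(0, height - size)
        left = rng.randint(0, width - size)
        for r in range(top, top + size):
            for c in range(left, left + size):
                out[r][c] = fill
    return out


if __name__ == "__main__":
    sample = [[1] * 6 for _ in range(6)]
    augmented = erase_local_regions(sample, rng=random.Random(0))
    for row in augmented:
        print(row)
```

Erasing only small patches, rather than perturbing the whole image, leaves most of the object intact while hiding some local evidence, so the network cannot rely on a single region. Patches may overlap in this sketch, so the number of erased pixels can be smaller than `num_regions * region_size ** 2`.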
Keywords/Search Tags: Computer Vision, Fine-grained Image Recognition, Attention Mechanism, Medical Imaging, Feature Extraction