Research on Key Technologies of Weakly Supervised Object Detection

Posted on: 2024-03-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Zhang
GTID: 1528307079450584
Subject: Signal and Information Processing
Abstract/Summary:
Object detection is one of the fundamental problems and research hotspots in computer vision. It has important theoretical significance and application value, and has been widely used in fields such as intelligent security, autonomous driving, and medical diagnosis. With the widespread application of deep learning, object detection methods trained on large-scale data have made significant progress by relying on fine-grained instance-level annotations. However, labeling training data is expensive and time-consuming, which greatly limits the application scenarios of these methods. To this end, researchers have proposed weakly supervised object detection methods, which require only image-level annotations to train object detection models and thus greatly reduce the annotation cost. In the absence of instance-level supervision, how to use coarse-grained image-level labels to accurately locate and identify object instances remains a difficult problem in weakly supervised object detection. Effective weakly supervised object detection models are urgently needed in computer vision, and in artificial intelligence more broadly, to reduce labeling cost and improve detection accuracy. Therefore, this dissertation investigates key technologies of weakly supervised object detection. To address the challenges of weakly supervised object detection, it studies the key technologies of network structure design, proposal optimization, and network optimization, with the overall goal of building a more efficient weakly supervised object detection model, and extends these techniques to contrastive learning across image instances. It also explores the problem of pseudo-label mining when fewer annotations are available. The research and contributions are summarized as follows:

(1) Due to the lack of bounding box coordinate regression in
the weakly supervised object detection network, this dissertation studies network structure design based on online hierarchical association. The method first proposes an online associated detection model with multiple parallel branches, which combines the object region classification task and the bounding box regression task by embedding the regression hierarchical refinement module into the classification hierarchical refinement module. It also mines pseudo-instance annotation information from the previous layer of parallel branches and uses the contextual information to supervise the training of the next branch. In addition, an object region hierarchical refinement module is proposed to gradually refine the detection results through multiple layers of object region detectors, continuously filtering out inaccurate object regions and then perceiving the complete region of the target object to output more accurate detection results.

(2) Due to the low quality and unbalanced distribution of proposals in weakly supervised object detection, this dissertation studies proposal optimization based on recursive sampling transformation. The method iteratively optimizes pre-generated proposals during the training of the weakly supervised object detection network. First, a proposal self-transformation algorithm is proposed to adjust the coordinates of proposals using iteratively generated object-aware coordinate offsets, improving the localization quality of proposals. Then, a proposal self-sampling algorithm is proposed to sample proposals based on the confidence scores of previous iterations. At the same time, the number of sampled proposals is gradually reduced during iterative optimization to reduce the number of negative proposals. Finally, a decoupled proposal detection model is proposed that always generates new confidence scores and coordinate offsets for the pre-generated proposals, reducing error accumulation in iterative optimization.

(3) Since it is
difficult to distinguish multiple object instances in weakly supervised object detection, this dissertation studies network optimization based on progressive instance learning. By transforming the optimization of the network into a progressive instance learning process, the method first learns to detect a single object instance and then transitions to learning to detect multiple object instances. To this end, a small amount of single-instance annotation information is introduced to guide the network in learning to detect complete object instances in single-object image scenarios, while bridging the feature learning gap between the image level and the instance level. Then, a spatial overlap-based instance mining algorithm is proposed to mine multi-instance object information from the detection results of single-instance level learning. This information supervises multi-instance level learning for correct localization in multi-object scenarios, thereby enhancing the multi-object detection capability of the network.

(4) To solve the problem of weak instance feature representation in weakly supervised object detection, this dissertation studies cross-image perception based on instance contrastive memory. The method assists instance representation learning for the current input image by explicitly establishing semantic associations with instances from other images through a contrastive learning mechanism. First, an instance diversity memory updating algorithm is proposed, which mines reliable instance representations from proposal features, stores them in a similarity-based memory bank, and uses multiple expression vectors to reflect instance diversity. Then, based on the instance representations in the memory bank, a memory-aware instance mining algorithm is proposed to evaluate the completeness of proposals by computing their similarity to the stored instance representations, so as to mine more reliable object instances. Meanwhile, a memory-aware proposal sampling algorithm is proposed
to select more positive proposals based on the similarity of instance representations and to eliminate some negative proposals with low similarity, alleviating the imbalance between positive and negative samples.

(5) Furthermore, to address pseudo-label noise in weakly supervised object detection with fewer labels, this dissertation studies pseudo-label mining based on dynamic update self-training. The method dynamically updates and corrects the generated pseudo-labels during network training. First, a complexity-based image grouping mechanism is proposed to divide the unlabeled image data into image groups ordered from easy to difficult, and to generate pseudo-labels starting from the simplest image group. Then, an intra-image group update mechanism is proposed to iteratively update the pseudo-labels within each image group. At the same time, an inter-image group update mechanism is proposed to update the pseudo-labels in all image groups considered so far after the training of each image group. Furthermore, building on the confidence score weighted cross-entropy loss, a label reliability-sensitive loss is proposed to reduce the impact of noisy labels on network training. Moreover, a sample difficulty-sensitive loss is proposed to balance the contributions of difficult and easy samples, gradually shifting toward the learning of difficult samples during training to enhance the detection ability of the network.

Finally, this dissertation validates the effectiveness of the proposed methods through extensive experiments on the widely used PASCAL VOC 2007, PASCAL VOC 2012, and MS-COCO datasets. The proposed key techniques have theoretical significance and practical application value for future research on weakly supervised object detection methods.
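
As an illustration of the confidence score weighted cross-entropy loss that contribution (5) builds on, the following is a minimal sketch only. It is not the label reliability-sensitive loss proposed in the dissertation, whose exact form the abstract does not give. The function name, the argument layout, and the choice to average over the number of samples are all assumptions made for this example.

```python
import math


def confidence_weighted_cross_entropy(probs, labels, confidences):
    """Cross-entropy where each sample is scaled by its pseudo-label confidence.

    probs:       one list of per-class probabilities per proposal
    labels:      pseudo-label class index for each proposal
    confidences: weight in [0, 1] for each pseudo-label (assumed to be the
                 detector's confidence score for that pseudo-label)
    """
    if not (len(probs) == len(labels) == len(confidences)):
        raise ValueError("probs, labels and confidences must have equal length")
    if not probs:
        return 0.0

    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y, w in zip(probs, labels, confidences):
        # Standard cross-entropy term for the pseudo-label class,
        # down-weighted when the pseudo-label is less trusted.
        total += -w * math.log(max(p[y], eps))

    # Assumption: average over the number of samples, not the sum of weights.
    return total / len(probs)
```

With this weighting, a pseudo-label with confidence 0 contributes nothing to the loss and one with confidence 1 contributes its full cross-entropy term, so low-confidence (likely noisy) pseudo-labels pull less on the network during training.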
Keywords/Search Tags: Weakly Supervised Object Detection, Network Structure Design, Proposal Optimization, Network Optimization, Pseudo-label Mining