
Research On Camouflaged Object Detection Algorithms By Aggregating Multi-scale Scene Context Features

Posted on: 2024-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2568307106475834
Subject: Electronic information
Abstract/Summary:
Camouflaged Object Detection (COD) is a fundamental task in computer vision that aims to locate and segment objects whose texture, color, and other visual attributes are intrinsically similar to their surrounding natural or man-made environment. COD has both academic significance and practical value, with applications in fields such as computer vision, medicine, agriculture, and art. In recent years, with the rise of deep learning, deep-learning-based COD algorithms have achieved breakthroughs in performance. However, because camouflaged objects are visually similar to the scene background, COD remains highly challenging. Existing methods have two main shortcomings. On the one hand, the local appearance features extracted for the target are insufficient, making it difficult for a model to accurately locate and distinguish camouflaged objects in many challenging real-world scenarios (such as objects with highly similar colors, slender shapes, or ambiguous camouflaged targets). On the other hand, the feature aggregation of existing one-stage prediction models is insufficient, making it difficult to handle the high inter-class similarity and significant intra-class variation of camouflaged objects (such as changes in object size and shape, blurry boundaries, and occlusion). To address these problems, this paper fully exploits the rich scene context information in the multi-level features extracted by deep networks and designs a well-founded multi-level feature aggregation mechanism. Building on existing research, this paper proposes a camouflaged object detection method that aggregates multi-scale scene context features. The main contributions are summarized as follows:

(1) To address insufficient local appearance features for object detection, this paper proposes a COD network that progressively aggregates multi-scale scene context features, exploiting the rich local-to-global contextual information in complementary cross-level features. First, a U-shaped Context-Aware Module (UCAM) comprehensively mines the scene context in the multi-level features, from local to global and from small to large scale. Then, a Cross-level Feature Aggregation Module (CFAM) uses progressive residual aggregation to fully capture the complementary information between adjacent levels, gradually refining the prediction from coarse to fine. This effectively compensates for problems such as the dilution of small-object features, the loss of local details, and the target ambiguity caused by missing global semantic guidance. The proposed network was evaluated on four highly challenging test datasets, CHAMELEON, CAMO, COD10K, and NC4K, and the results fully demonstrate its effectiveness and superiority.

(2) To address insufficient feature aggregation in one-stage prediction models, this paper proposes a lightweight COD network called the Bi-level Recurrent Refinement Network (Bi-RRNet), which adopts a bidirectional iterative refinement strategy. The Lower-level Recurrent Refinement Network (L-RRN) uses a Region-Consistency Enhancement Module (RCEM) to recursively refine, in a top-down manner, the features enhanced by the Multi-scale Scene Perception Module (MSPM), from high-level semantics to low-level details. The Upper-level Recurrent Refinement Network (U-RRN) then recursively polishes the features refined by the L-RRN, thereby aggregating all multi-level contextual features for accurate dense prediction. The MSPM uses a learnable global context vector to modulate the multi-scale contextual features produced by a filter group with different local receptive fields, capturing rich scene context to alleviate significant intra-class variation. The RCEM uses high-level features carrying global region-consistency information to guide the filtering of cluttered information out of lower-level features, maximizing the inter-class contrast between the camouflaged object and its surroundings. Extensive evaluation on four highly challenging benchmark datasets, CHAMELEON, CAMO, COD10K, and NC4K, shows that the model performs well in both accuracy and parameter count, with a parameter count (only 14.95M) about half that of the state-of-the-art BSA-Net.
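The thesis text does not include code, but the core idea behind the MSPM, a learnable global context vector reweighting the outputs of a filter group with different local receptive fields, can be sketched roughly as follows. This is a minimal NumPy illustration, not the author's implementation: the function name, shapes, softmax reweighting, and the box-filter stand-in for convolutions of different receptive fields are all assumptions.

```python
import numpy as np

def mspm(feature, context_weights, scales=(1, 3, 5)):
    """Sketch of a Multi-scale Scene Perception step (hypothetical names/shapes).

    feature:         (C, H, W) feature map.
    context_weights: (len(scales),) stand-in for a learnable global context
                     vector; in a real network it would be predicted from the
                     feature map itself.
    """
    C, H, W = feature.shape
    branches = []
    for k in scales:
        pad = k // 2
        padded = np.pad(feature, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
        # A k x k box filter stands in for a conv branch with receptive field k.
        out = np.zeros_like(feature)
        for i in range(H):
            for j in range(W):
                out[:, i, j] = padded[:, i:i + k, j:j + k].mean(axis=(1, 2))
        branches.append(out)
    # The global context vector modulates (reweights) each scale branch,
    # so the fusion adapts to the scene rather than averaging scales blindly.
    w = np.exp(context_weights) / np.exp(context_weights).sum()  # softmax
    return sum(wi * b for wi, b in zip(w, branches))
```

With uniform context weights this degenerates to a plain multi-scale average; a learned, input-dependent vector lets the module emphasize the receptive-field size that best matches the camouflaged object's scale.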
Keywords/Search Tags:Camouflaged object detection, Scene context information, Feature aggregation, Recurrent Refinement Network