Rainy weather seriously degrades the quality of the video and images captured by outdoor visual monitoring systems. Rain streaks in the air occlude the background scene and blur the image. Advanced vision tasks such as object detection, image segmentation, and autonomous driving all require high-quality image input. To keep these vision systems working reliably, image deraining has become an important preprocessing step in outdoor monitoring systems, one that helps yield more accurate detection and recognition results.

Depending on the number of frames processed, deraining can be divided into video deraining and single image deraining. Video deraining is relatively simple because it can exploit the temporal redundancy of the video sequence and the differences between frames to remove rain streaks. Single image deraining has to capture the details of rain from the information in adjacent pixels alone, and the complexity of the spatial information and background scenes makes this extremely difficult.

Current single image deraining methods fall into two main categories: traditional methods and deep learning methods. Traditional methods impose prior assumptions on the rain layer and the background layer. They apply only under narrow conditions, are costly in time and computation, and cannot remove rain streaks from complex and diverse rainy images. Deep learning methods treat deraining as the learning of a nonlinear function and search for parameters that separate the rain layer from the background scene. Compared with traditional methods, they have attracted wide attention for their strong feature representation ability and have achieved good performance, but there is still considerable room for improvement. The shortcomings of current deep learning-based single image deraining methods are as follows: (1) Feature extraction relies on a single mechanism. The different feature extraction capabilities of convolution and Transformer are ignored, and their respective advantages are not combined to extract rich and accurate rain features. (2) Current methods focus on deraining quality, and their elaborately designed deraining modules lead to relatively complex algorithms that require a large number of parameters and cannot deliver high quality and high efficiency at the same time. This results in long processing times and poor real-time performance.

To address these problems, this thesis carries out a series of studies covering the following two aspects:

(1) To address the reliance on a single feature extraction mechanism, this thesis proposes an end-to-end image deraining method based on a lightweight dual-branch network (LBFNet). The network uses two different branches for feature extraction. The CNN branch extracts local features to compensate for the Transformer branch's lack of inductive bias, and the Transformer branch captures long-range dependencies to learn global information and refine weak texture details. A Dual Stream Feature Fusion module (DSFF) continuously fuses the intermediate features of the two independent branches so that they complement each other and yield rich features to guide image reconstruction. This thesis also proposes a multi-scale lightweight single image deraining method based on large kernel convolution (MLKNet). In this method, a Multi-scale Attention Extraction Block (MAEB) combines multi-scale processing with large kernel convolution. Large kernel convolution produces rich attention maps at different levels of granularity, and its large receptive field extracts both global and local information, so that more effective pixels are perceived in the spatial dimension. A Channel Attention Feedforward Network (CAFN) is further proposed to focus on important channel information and refine the rain features.

(2) To make the models lightweight, LBFNet removes the complex Transformer decoder blocks and uses only a dual-branch encoder that combines CNN and Transformer. First, following the idea of the lightweight Ghost network, ordinary convolutions are implemented with Ghost modules to reduce redundant feature maps and the amount of convolutional computation. Second, recursive operations are introduced in the Transformer branch to share weights between different Transformer blocks. This increases the effective network depth while reducing GPU consumption and the number of model parameters. In MLKNet, the large kernel convolution is decomposed, which reduces the computation and parameters that a large kernel would otherwise require. In addition, a simple self-attention distillation method is proposed for network training. It requires neither an additional supervision module nor a large number of extra parameters: the attention maps of adjacent layers are used so that shallow features learn the representation of high-level features. This strengthens the attention maps of every layer of the network, reduces the number of network parameters, and speeds up training. As a result, a smaller network can be trained to reach the performance of a deeper one.

After training, the two proposed algorithms are compared with six deep learning-based deraining algorithms on a public dataset and a real rain dataset. The results show that the proposed algorithms not only remove synthetic rain streaks effectively but also suppress the interference of real rain streaks. The PSNR and the number of parameters of the compared algorithms are also visualized, showing that the proposed algorithms achieve a good balance between parameter count and deraining performance and realize a lightweight model.
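
As a rough illustration of why the Ghost module and large kernel decomposition mentioned above reduce parameters, the following minimal sketch counts convolution weights only. It is not the thesis implementation. The function names, the channel width of 64, the Ghost ratio of 2, and the 21 x 21 kernel with dilation 3 are all assumptions chosen for illustration, and the decomposition shown (a depthwise convolution, a depthwise dilated convolution, and a 1 x 1 convolution) is one common scheme that may differ from the one used in MLKNet.

```python
import math


def conv_params(c_in, c_out, k):
    """Weights in a dense k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost-style module (hypothetical helper).

    A dense convolution produces c_out / ratio intrinsic feature maps;
    cheap depthwise cheap_k x cheap_k convolutions derive the remaining maps.
    """
    intrinsic = math.ceil(c_out / ratio)
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


def decomposed_large_kernel_params(channels, k, dilation):
    """Weights when a k x k kernel is approximated by three smaller steps.

    Assumed decomposition: a (2d-1) x (2d-1) depthwise convolution, a
    ceil(k/d) x ceil(k/d) depthwise dilated convolution, and a 1 x 1
    pointwise convolution, where d is the dilation.
    """
    local_k = 2 * dilation - 1
    dilated_k = math.ceil(k / dilation)
    depthwise = channels * local_k ** 2
    depthwise_dilated = channels * dilated_k ** 2
    pointwise = channels * channels
    return depthwise + depthwise_dilated + pointwise


if __name__ == "__main__":
    c = 64  # assumed channel width
    print("dense 3x3 conv:      ", conv_params(c, c, 3))
    print("Ghost-style 3x3:     ", ghost_params(c, c, 3))
    print("dense 21x21 conv:    ", conv_params(c, c, 21))
    print("decomposed 21x21:    ", decomposed_large_kernel_params(c, 21, 3))
```

Under these assumed settings the Ghost-style module needs roughly half the weights of the dense 3 x 3 convolution, and the decomposed large kernel needs far fewer weights than a dense 21 x 21 kernel. The actual savings in LBFNet and MLKNet depend on the configurations used in the thesis.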