Optical flow is an important research direction in computer vision, with broad application prospects in object tracking, pedestrian re-identification, and video compression. Optical flow describes the motion of objects in the visual domain; at the image level, it can be understood as estimating the displacement between corresponding (similar or identical) pixels in two consecutive frames. End-to-end optical flow estimation based on deep convolutional neural networks is the dominant research approach today. Because occlusion, large displacements, and fast-moving small objects have long degraded the accuracy of optical flow estimation, most research has focused on designing algorithms to address these problems. This paper analyzes several classic optical flow estimation algorithms. These algorithms use the features of the image pair to construct a correlation volume pyramid, look up correlation volumes on each level of the pyramid according to the current optical flow, and then iteratively refine the optical flow with an update network driven by the retrieved correlation volumes. The correlation volumes retrieved in this way are by default treated as equally important to the network in the initial stage. However, the analysis in this paper shows that this is not the case: among the features of the next frame that are matched against a given feature of the previous frame, only a few are the best matches. Therefore, to address the unequal contribution of the retrieved correlation volumes to the network, this paper proposes a global correlation volume attention network and a local correlation volume attention network. Weights are assigned to the correlation volumes according to their degree of importance, so that the network can attend to the useful information in the correlation volumes and improve the accuracy of optical flow estimation.

The main
work and contributions of this paper are as follows:

1. Optical flow estimation methods in classical networks, which are based on transforming all correlation volumes, ignore the unequal importance of the correlation volumes themselves, which express feature matching relationships. This paper proposes an optical flow estimation algorithm with a global correlation volume attention mechanism (CVA-Net), which assigns different weights to the correlation volumes according to their importance, so that the network can perceive useful correlation volume information faster, converge faster, and estimate optical flow more accurately. First, to reduce the computational cost of the network and link the correlation volumes and the optical flow more closely, the channel dimensions of the correlation volumes and the optical flow are reduced by a convolution operation, which yields the motion features of the image pair. Then, building on the channel attention mechanism, a global attention mechanism is introduced into the computation of the correlation volumes: it integrates global channel information and then assigns a weight to each channel, that is, to the correlation volumes between the features of the image pair. The proposed method outperforms many classical algorithms on the optical flow estimation dataset (corresponding to Chapter 3 of this paper).

2. This paper analyzes the CVA-Net network proposed in Chapter 3, whose global channel weighting of the correlation volumes requires a channel dimensionality reduction operation. This dimensionality reduction loses channel information, that is, correlation information between correlation volumes. To address this problem, this paper improves the existing attention mechanism and proposes an optical flow estimation algorithm with a local correlation volume attention mechanism (CVLA-Net). This algorithm uses a fixed-size 1D convolution to
extract the correlation information between the current channel and its neighboring channels, and uses this information to determine the weight of the current channel. Depending on whether the 1D convolution that extracts cross-channel interaction information shares its weights, this paper proposes two variants of CVLA-Net: a local attention mechanism network with weight sharing (CVLA-Net S) and a local attention mechanism network without weight sharing (CVLA-Net NS). The local correlation volume attention mechanism adopted by CVLA-Net avoids the loss of channel information caused by dimensionality reduction, and at the same time reduces the number of network parameters by 0.11 M (million) compared with CVA-Net. More importantly, both variants of CVLA-Net are overall comparable to, or even better than, CVA-Net in optical flow estimation accuracy (corresponding to Chapter 4 of this paper).
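
The contrast between the global and local channel attention schemes summarized above can be illustrated with a minimal, dependency-free Python sketch. It is not the CVA-Net or CVLA-Net implementation: the function names, the ReLU and sigmoid bottleneck, the zero padding at the channel boundaries, and the odd kernel size are all assumptions made for illustration. Each weighting function takes one descriptor per channel (here, the global average of each correlation channel) and returns one weight per channel.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def global_average_pool(channels):
    """Collapse each channel (a 2D list of floats) to its mean value."""
    return [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in channels
    ]


def global_channel_weights(pooled, w_down, w_up):
    """Global attention (assumed bottleneck form): C -> C/r -> C.

    w_down has C/r rows of length C, w_up has C rows of length C/r.
    Every weight depends on all channels, but the information passes
    through the reduced dimension C/r.
    """
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w_down]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_up]


def local_channel_weights(pooled, kernels):
    """Local attention: a 1D convolution over the channel axis.

    kernels[i] is the odd-length kernel used for channel i, so each weight
    depends only on channel i and its nearest neighbours, with no
    dimensionality reduction. Boundaries are zero-padded (an assumption).
    """
    k = len(kernels[0])
    pad = k // 2
    padded = [0.0] * pad + list(pooled) + [0.0] * pad
    return [
        sigmoid(sum(kern[j] * padded[i + j] for j in range(k)))
        for i, kern in enumerate(kernels)
    ]


def shared_kernels(kernel, num_channels):
    """Weight-sharing variant: the same kernel is reused for every channel."""
    return [list(kernel) for _ in range(num_channels)]


def reweight(channels, weights):
    """Scale every value of channel i by weights[i]."""
    return [[[w * v for v in row] for row in ch] for ch, w in zip(channels, weights)]
```

In this sketch, the global bottleneck needs 2·C·(C/r) weights for C channels and reduction ratio r, the weight-sharing local variant needs only k weights for kernel size k, and the unshared variant needs C·k. Passing `shared_kernels(kernel, C)` to `local_channel_weights` gives the shared behaviour, while passing C distinct kernels gives the unshared one.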