
Video Salient Object Detection With Motion Quality Compensation

Posted on: 2024-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: H S Wang
Full Text: PDF
GTID: 2568307148962969
Subject: Software engineering
Abstract/Summary:
Computer vision is developing rapidly. Within it, video salient object detection (VSOD) imitates the human visual mechanism so that a computer can automatically identify the objects and regions that interest human viewers. In the era of big data, VSOD helps people extract the saliency information they need from massive video data more conveniently and quickly. Existing VSOD models generally adopt short-term methods: they consider only a limited number of consecutive frames around the current frame when dynamically balancing the fusion of spatial and temporal saliency. This leads to two inherent drawbacks. First, considering only a finite number of consecutive frames conflicts with the true mechanism of the human visual system and ignores spatio-temporal information at a larger scale. Second, although the human visual mechanism is more likely to attend to moving objects, little attention has been paid to how motion information affects model performance when cross-modal spatio-temporal features are fused to generate the final saliency map. As a result, existing VSOD models continue to produce failure cases, and both the way VSOD is performed and the quality of motion information have become the main technical bottlenecks to improving performance. To address these bottlenecks, this thesis proposes the following two solutions.

1) A new optical flow calculation method based on UFlow is proposed. The improvement has two parts. First, traditional optical flow calculation is not stable enough: when an object remains stationary for a short period, it cannot generate high-quality optical flow maps. By expanding the perception range of optical flow, a briefly stationary object can be converted into motion at a speed suitable for optical flow calculation. Computing optical flow between the current video frame and frames further in the past and future makes it easier to obtain the current motion state of the salient object. Expanding the perception range is therefore a long-term optical flow calculation method, which overcomes the inability of traditional methods to capture the motion state of a salient object that stays stationary for a short time. Second, an optical flow quality perception module is added. Because the frames within the expanded range lie at different distances from the current frame, a static object is converted into different motion speeds when optical flow is calculated, so multiple optical flow maps of differing quality are generated for the current frame. The quality perception module selects the highest-quality map as the optimal optical flow map of the current frame, which greatly improves the quality of the optical flow and, in turn, the spatio-temporal feature fusion of the VSOD model.

2) Current VSOD models widely adopt the short-term method: they consider only the spatial and motion information provided by a finite number of consecutive frames around the current frame, and simply fuse the spatial and temporal features to obtain the final saliency map. To address this disadvantage, a new VSOD method is proposed. It performs VSOD in a fully long-term manner by recasting traditional video salient object detection as a data mining problem. First, all object boxes containing objects in the video sequence are obtained. Then the salient object boxes are mined in an easy-to-difficult order, and the mined salient objects are used to guide and train the network. Because all object boxes are available at the same time during this process, it is a fully long-term detection method.

The new UFlow-based optical flow calculation method effectively improves the quality of motion information, and this higher-quality motion information in turn improves spatio-temporal feature fusion. In addition, unlike models that use short-term methods, this thesis proposes a long-term spatio-temporal information mining method for video salient object detection. Performance was evaluated on five widely used benchmark datasets with widely used evaluation metrics, and qualitative and quantitative comparisons were made with mainstream methods. The experimental results show that the proposed method effectively improves optical flow quality and the accuracy of video salient object detection.
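The idea of expanding the perception range and then keeping the best optical flow map can be illustrated with a minimal sketch. It is not the thesis implementation: the names `select_best_flow`, `mean_flow_magnitude`, and `toy_flow` are hypothetical, the flow estimator is passed in as a placeholder rather than being UFlow, and mean flow magnitude is only an assumed stand-in for the optical flow quality perception module, whose actual design is not described in the abstract.

```python
from math import hypot


def mean_flow_magnitude(flow):
    """Assumed proxy for flow quality: average displacement length per pixel."""
    vectors = [v for row in flow for v in row]
    return sum(hypot(dx, dy) for dx, dy in vectors) / len(vectors)


def select_best_flow(frames, t, offsets, compute_flow, quality=mean_flow_magnitude):
    """Compute flow from frame t to each frame t+k and keep the best-scoring map.

    frames       -- sequence of frames (any type accepted by compute_flow)
    t            -- index of the current frame
    offsets      -- candidate temporal offsets k (past: negative, future: positive)
    compute_flow -- placeholder flow estimator: (frame_a, frame_b) -> flow map
    quality      -- scoring function; higher means better (assumption)
    Returns (best_offset, best_flow).
    """
    best = None
    for k in offsets:
        j = t + k
        if k == 0 or not 0 <= j < len(frames):
            continue  # skip the current frame itself and out-of-range frames
        flow = compute_flow(frames[t], frames[j])
        score = quality(flow)
        if best is None or score > best[0]:
            best = (score, k, flow)
    if best is None:
        raise ValueError("no valid reference frame in range")
    return best[1], best[2]


def toy_flow(frame_a, frame_b):
    """Toy estimator: each 'frame' is just the object's x position (hypothetical)."""
    return [[(frame_b - frame_a, 0.0)]]


# The object is stationary in frames 0-2 and only moves later, so adjacent-frame
# flow is zero while a wider temporal range reveals its motion.
positions = [0, 0, 0, 3, 5]
best_k, best_flow = select_best_flow(positions, t=1, offsets=(-1, 1, 2, 3),
                                     compute_flow=toy_flow)
print(best_k, best_flow)
```

In this toy setup the adjacent frames yield zero flow for the briefly stationary object, and only the wider offsets expose its motion. A learned quality score would replace the magnitude proxy, which simply favours the largest displacement.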
Keywords/Search Tags: Saliency detection, Motion information, Motion quality, Long-term approach