
Multi-Feature Fusion For Video Object Detection

Posted on: 2020-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Yue
Full Text: PDF
GTID: 2428330602950601
Subject: Engineering
Abstract/Summary:
The task of object detection is to identify the category of each object in an image and to mark its size and position with a bounding box. With the wide application of deep learning, research on image object detection has made great progress. In recent years, researchers have begun to study object detection methods for video. A video consists of a series of continuous images with a certain mapping relationship between them, so it provides more contextual and spatial information. At the same time, changes in the video scene introduce many challenges, such as object occlusion, motion distortion, and lighting blur. Applying an image object detection method directly to video gives poor detection results and low detection efficiency. The focus of video object detection research is therefore how to use the contextual and redundant information provided by video to solve the problems of missed detections, false detections, and inaccurate bounding boxes while maintaining detection efficiency. To address these problems, this thesis makes the following three contributions.

Firstly, to solve the problem of partial details being lost when image features are extracted, this thesis proposes a video object detection method based on two-stage feature fusion. The first fusion enhances the final feature representation of an image by combining the feature maps of the shallow, middle, and deep convolution layers during feature extraction, so that deep semantic information is retained while the contour and texture information of the object is added. The second fusion combines the feature map of the current image with those of its preceding and following images, in order to enhance the object features of the current image and to supply information that is missing when the object is occluded, deformed, or blurred.

Secondly, to reduce the influence of complicated backgrounds on object features, this thesis proposes a video object detection method based on an attention mechanism. The selection function of the attention mechanism is used to enhance the saliency of object features in the image feature map, so that the model focuses on the key information used for object detection. The attention mechanism is also used to design a hand-crafted rule for feature fusion, which reduces computation and improves the efficiency and accuracy of feature fusion.

Thirdly, to address the mutual interference of similar objects during detection, this thesis proposes a video object detection method based on multi-model feature fusion. The method uses multiple sub-models to learn the features of similar objects separately, runs the feature-processing parts of the sub-models in parallel, and uses the attention mechanism to select and combine their features so that the whole model can perform a second round of learning on the dataset. This reduces the classification error for multi-category objects and improves both precision and recall.

All three methods achieve better results than the original method on the public ImageNet VID dataset, with varying degrees of improvement in detection efficiency and accuracy.
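The idea of fusing the current frame's features with those of its neighbouring frames, with weights chosen by an attention-like rule, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the weighting rule, so the sketch assumes cosine similarity to the current frame followed by a softmax, and it treats each frame's feature map as a flat vector. The names `fuse_temporal_features` and `radius` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_temporal_features(frames, index, radius=1):
    """Fuse the feature vector of frame `index` with its neighbours.

    Assumption (not from the thesis): each frame in the window is weighted
    by the softmax of its cosine similarity to the current frame, so frames
    that resemble the current one contribute more to the fused feature.
    """
    lo = max(0, index - radius)
    hi = min(len(frames), index + radius + 1)
    window = frames[lo:hi]
    current = frames[index]
    weights = softmax([cosine_similarity(current, f) for f in window])
    fused = [
        sum(w * f[d] for w, f in zip(weights, window))
        for d in range(len(current))
    ]
    return fused, weights


# Toy example: the previous frame matches the current one, the next does not.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
fused, weights = fuse_temporal_features(frames, index=1)
```

In this toy example the dissimilar following frame receives a smaller weight than the matching preceding frame, so the fused vector stays close to the current frame's features. A real detector would apply such weights per spatial location on convolutional feature maps rather than on flat vectors.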
Keywords/Search Tags:object detection, video object detection, deep learning, feature fusion, attention mechanism