
Research On Video Structured Technology Based On Object Detection

Posted on: 2021-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z D Li
GTID: 2428330623968265
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of network technology and the Internet of Things, a large number of videos are produced at every moment of daily life. Although video expresses information vividly and intuitively, it consumes a lot of storage space and lacks structure, which makes its storage and retrieval very difficult. At the same time, existing video compression technology is developing far more slowly than video data is growing, which leads to high storage costs. A video analysis technology that can automatically acquire key information and save storage space is therefore urgently needed.

Video structured technology refers to extracting key information at different levels from video using algorithms from image processing, text analysis and other fields, and producing a corresponding semantic description for the key information at each level. Finally, through a standardized video description, the key information and the corresponding semantic information are stored in structured form, which is convenient for recording and retrieval. This paper combines deep learning with traditional algorithms and studies several key technologies in video structured technology. The main contents are as follows:

(1) This paper proposes a novel video structuring method that combines traditional methods and deep learning. The method mainly involves key frame extraction, object detection, action recognition, scene recognition, image captioning and other technologies. It allows the information in the video to be expressed effectively and generates a descriptive sentence for each image, which makes the content easier to store and retrieve and greatly enriches the structured information.

(2) This paper makes full use of the moving target information in the video and proposes a key frame extraction method based on that information. The method obtains a more comprehensive and robust feature through the weighted fusion of frame difference, HSV color information and motion vectors. A threshold is then set by an adaptive threshold algorithm to select preliminary key frames from the video frames. Finally, the final key frames are selected by comparing the target information of the preliminary key frames, obtained with object detection. By combining the target information of the video with deep learning, this provides a new key frame extraction method.

(3) This paper optimizes the structure of YOLOv3 and proposes a method to improve object detection. On the network structure side, the features of the backbone network are extracted through dilated convolutions with different dilation rates, which gives the feature layers multi-scale information and helps identify targets of different scales. On the detection side, regions of interest with poor detection intensity are screened out by a projection counting algorithm. The targets in those regions are then matched as closely as possible to the best detection scale of the network model and sent to the neural network again for detection. The multiple detection results are integrated, making the final object detection results more accurate.
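
The first two stages of the key frame extraction in contribution (2), weighted fusion of several cues followed by an adaptive threshold, can be illustrated with a minimal sketch. The abstract does not give the fusion weights, the normalization, or the thresholding rule, so everything here is an assumption: the per-frame cue values are taken as already computed, each cue is min-max normalized, the weights (0.4, 0.3, 0.3) are arbitrary, and the threshold is the mean of the fused scores plus `k` standard deviations. All function names are hypothetical, and the final selection step that compares detected targets is not shown.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse_scores(frame_diff, hsv_dist, motion_mag, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three normalized per-frame cues (assumed weights)."""
    if not (len(frame_diff) == len(hsv_dist) == len(motion_mag)):
        raise ValueError("all cues must have one value per frame")
    cues = [min_max_normalize(c) for c in (frame_diff, hsv_dist, motion_mag)]
    return [
        sum(w * cue[i] for w, cue in zip(weights, cues))
        for i in range(len(frame_diff))
    ]


def adaptive_threshold(scores, k=1.0):
    """Assumed rule: mean plus k population standard deviations."""
    return mean(scores) + k * pstdev(scores)


def select_candidate_key_frames(frame_diff, hsv_dist, motion_mag,
                                weights=(0.4, 0.3, 0.3), k=1.0):
    """Return indices of frames whose fused score exceeds the threshold."""
    fused = fuse_scores(frame_diff, hsv_dist, motion_mag, weights)
    threshold = adaptive_threshold(fused, k)
    return [i for i, score in enumerate(fused) if score > threshold]
```

Calling `select_candidate_key_frames` with three equal-length lists, one value per frame, returns the indices of the candidate key frames. Because the threshold is derived from the statistics of the video itself, it adapts to videos with more or less overall motion, which is the general idea behind an adaptive threshold; the specific rule used in the thesis may differ.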
Keywords/Search Tags: video structured, key frame extraction, object detection, semantic information