Research On Deep Temporal Feature Modeling In Irregular Video Datasets

Posted on: 2023-09-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D S Li
Full Text: PDF
GTID: 1528306902455484
Subject: Computer application technology
Abstract/Summary:
Video object detection is widely used and has both research significance and practical application value. This thesis repeatedly evaluated several classic image and video detection algorithms on our specialized rice disease and pest video dataset and found that their detection accuracy was limited. The reason is that the objects in the rice dataset (such as lesions) are more irregular than those in other datasets, and the foreground and background of each frame are complex. Moreover, video detection has its own difficulties, such as object motion blur, partial occlusion, object deformation, and video defocus. Video detection is therefore more difficult and more challenging than image detection.

To address these issues, this thesis proposes deep temporal feature modeling of video sequences, in which the sequences are modeled with deep features and temporal features respectively. Deep feature modeling enhances the extraction of deep semantic features within a frame, while temporal feature modeling enhances the use of temporal information across video frames, which offers another effective route to better detection. The main work and contributions are as follows:

(1) Video temporal information and its mathematical equations are defined, and existing video object detection methods are summarized. Extracting the temporal information among frames and extracting key frames are shown to be among the main directions of video detection. The characteristics of video, video temporal information, and video object detection are analyzed, along with the causes of object motion blur, object occlusion, object deformation, and video defocus.

(2) To address the low average detection accuracy and high recognition error rate in video detection of rice diseases and pests, a feature extraction method based on a Deep Convolutional Neural Network (DCNN) is proposed, drawing on the traditional approach to extracting deep semantic features. The method improves the performance of rice disease and pest detection. Compared with several well-known algorithms, the proposed deep neural network method achieves strong detection results and better generalization ability.

(3) A step-type data sampler structure is proposed, which enhances the ability of the LSTM model to use the temporal feature information among frames and thereby implements temporal feature modeling. The step-type data sampler accelerates the convergence of the LSTM model and so improves its learning speed. The method is also extended to video detection experiments on the UCF101 human action recognition dataset, and the results verify its generalization ability. Experimental results show that the data sampler improves video detection accuracy.

(4) An LSTM model that combines ResNet-50 deep feature modeling with Context-LSTM temporal feature modeling is proposed, and an information transmission loss in model training is proposed. Detection experiments are carried out on the UCF101 human action recognition dataset, and the results verify the detection accuracy and robustness of the classifier. The Context-LSTM reaches a high top-1 accuracy on the entire validation set of UCF101, and the experimental results also verify that the model generalizes well.
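The abstract does not define how the step-type data sampler works, so the following is only an illustrative sketch under one assumed reading: frames are taken at a fixed step to form short clips that a sequence model such as an LSTM could consume. The function name `step_sample` and the parameters `step` and `clip_len` are hypothetical and do not come from the thesis.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def step_sample(frames: Sequence[T], step: int, clip_len: int) -> List[List[T]]:
    """Cut a frame sequence into non-overlapping clips sampled every `step` frames.

    This is an assumed interpretation of a "step-type" sampler, not the
    thesis's actual definition.
    """
    if step < 1 or clip_len < 1:
        raise ValueError("step and clip_len must be positive")

    # Number of consecutive source frames that one clip spans.
    span = (clip_len - 1) * step + 1

    clips: List[List[T]] = []
    # Advance by a whole span so clips do not share frames;
    # trailing frames that cannot fill a clip are dropped.
    for start in range(0, len(frames) - span + 1, span):
        clips.append([frames[start + i * step] for i in range(clip_len)])
    return clips


# Frame indices stand in for decoded frames in this toy example.
clips = step_sample(list(range(20)), step=3, clip_len=3)
print(clips)
```

In a real pipeline each sampled frame would first pass through a feature extractor (the abstract mentions ResNet-50) before the clip reaches the LSTM; that stage is omitted here.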
Keywords/Search Tags: Video object detection, Deep temporal feature information, Step-type data sampler, Information transmission loss, Context-LSTM, Human action detection