
Temporal Action Proposal And Detection Based On Dynamic Actionness Score Network And Adaptive Complementary Structure

Posted on: 2020-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
GTID: 2428330590984521
Subject: Signal and Information Processing
Abstract/Summary:
Human action recognition has long been a focus of computer vision, with many real-world applications, and the rapid development of Internet technology has promoted the advancement of its models and algorithms. Driven by large amounts of video data, the accuracy of action recognition on trimmed video in particular has improved significantly. However, the majority of videos in real applications are continuous and untrimmed, and contain multiple objects and multiple action instances. Researchers have therefore started to study techniques for action recognition in untrimmed videos, which has motivated a new and challenging task: temporal action detection. This task aims to solve two problems: (1) identifying the temporal boundaries (the start time and the end time) of each action instance, known as temporal action proposal; and (2) recognizing the class of each action instance. In this dissertation, three new network structures are proposed to improve the performance of temporal action proposal and detection. The main work is as follows:

1. A dynamic proposal network, called the temporal dynamic pooling (TDP) network, is proposed. Existing action proposal generation methods based on actionness scores often lack temporal information from consecutive video frames, which limits the accuracy of proposal and detection. The TDP network is designed on the basis of a multi-layer perceptron and iteratively calculates the actionness score of each frame in a video. It takes as input the residual of the consecutive multi-frame feature vector and the dynamically pooled feature vector, and thereby exploits the temporal information from consecutive frames. Meanwhile, the action classifier is trained on key frames extracted according to the actionness score, which reduces redundant computation in action recognition. Experimental results on the THUMOS14 dataset show that, compared with TAG, the best-performing existing method based on actionness scores, the TDP network improves the average recall rate (AR@100) by 11.2% and the action detection accuracy (mAP) by 3.8%.

2. A proposal evaluation network (PEN) is proposed to evaluate and post-process the candidate proposals generated by the TDP network and suppress redundant ones. The PEN, based on a multi-layer perceptron, calculates a confidence score for each candidate proposal, and redundant proposals are then removed by Soft-NMS post-processing according to that confidence score, which improves the average recall rate of the proposals. Experimental results show that applying PEN increases the AR@100 of the TDP network's action proposals by 6.04%.

3. A temporal action proposal method based on an adaptive complementary structure (ACS) is proposed. Proposals generated from actionness scores are more precise but less stable, whereas proposals generated by sliding windows are more stable but less precise, so the two approaches are complementary. We design an actionness score trustworthiness (AST) network, obtained by training the proposal evaluation network on actionness-score proposals. The AST network calculates the actionness score trustworthiness of each proposal generated by sliding windows, which indicates whether that proposal can be correctly detected by the actionness scores. The ACS network adaptively selects sliding windows to fill in the proposals that actionness-based generation misses because of low-quality actionness scores. Afterwards, a temporal convolution boundary regression network adjusts the temporal boundaries. Experimental results on the THUMOS14 dataset show that, compared with sliding windows and the TDP network, the ACS network improves AR@100 by 11.2% and 8.26%, respectively.
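The Soft-NMS post-processing step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which decay function or thresholds the thesis uses, so the Gaussian decay, the `sigma` and `score_floor` parameters, and the `(start, end, score)` proposal layout below are all assumptions made for illustration, not the author's implementation.

```python
import math


def temporal_iou(a, b):
    """Intersection over union of two temporal segments (start, end, ...)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(proposals, sigma=0.5, score_floor=0.001):
    """Gaussian Soft-NMS over temporal proposals.

    proposals: iterable of (start, end, confidence) tuples.
    sigma, score_floor: hypothetical defaults, not values from the thesis.
    Returns the proposals in the order they were selected, with decayed scores.
    """
    remaining = [list(p) for p in proposals]
    kept = []
    while remaining:
        # Select the proposal with the highest current confidence.
        best = max(range(len(remaining)), key=lambda i: remaining[i][2])
        top = remaining.pop(best)
        kept.append(tuple(top))

        # Decay the scores of overlapping proposals instead of deleting them.
        survivors = []
        for p in remaining:
            iou = temporal_iou(top, p)
            p[2] *= math.exp(-(iou ** 2) / sigma)
            if p[2] >= score_floor:
                survivors.append(p)
        remaining = survivors
    return kept


if __name__ == "__main__":
    candidates = [(0.0, 10.0, 0.9), (1.0, 11.0, 0.8), (20.0, 30.0, 0.7)]
    for start, end, score in soft_nms(candidates):
        print(f"{start:5.1f} {end:5.1f} {score:.3f}")
```

Unlike hard NMS, which discards every proposal whose overlap with a selected one exceeds a threshold, this variant only lowers the confidence of overlapping proposals, so a heavily overlapped candidate drops in rank rather than disappearing. That behaviour is consistent with the abstract's claim that the step improves average recall, although the exact configuration used in the thesis is not stated.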
Keywords/Search Tags: temporal action proposal and detection, dynamic pooling, multi-layer perceptron, Soft-NMS proposal suppression, adaptive complementary structure