Group Activity Recognition With Hierarchical Deep Neural Network

Posted on: 2020-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: D Li
Full Text: PDF
GTID: 2428330575988974
Subject: Control Engineering
Abstract/Summary:
Human action recognition has received considerable attention from computer vision researchers. It comprises single-person action recognition and group activity recognition. Group activity recognition builds on single-person action recognition and focuses on the group of people in a scene, which supports many applications, e.g. video surveillance, sports analytics, and video retrieval. In group activity recognition, the hierarchical structure between the group and its individuals is important, and the main challenge is to build more discriminative representations of group activity based on that structure. Researchers have proposed numerous methods to address this challenge. Hierarchical frameworks, typically based on RNNs, are widely adopted to represent the relationships between individuals and their corresponding group, and have achieved promising performance. However, these methods ignore the relationships and interactions between individuals and cannot account for the individuals' different contributions to the group activity, which limits recognition accuracy.

To address this problem, we propose a novel model for group activity recognition based on the nonlocal network, and present a novel pooling scheme for individual feature aggregation, named Attentive Pooling. The proposed model takes a bottom-up approach to represent and recognize individual actions and group activities in a hierarchical manner. First, multi-person tracklets are constructed from detections and trajectories, and static features are extracted from these tracklets by a nonlocal convolutional neural network (NCNN). Inside the NCNN module, the similarity between individuals is computed in order to capture the nonlocal context among them. The extracted features are then fed into the hierarchical temporal model (HTM), which is based on LSTM. The HTM is composed of an individual-level LSTM and a group-level LSTM, and focuses on group dynamics in a hierarchical manner. Within the HTM, dynamic features of individuals are extracted, and group activity features are generated by aggregating individual features with the attentive pooling scheme. Finally, the group activities and individual actions are classified from the output of the HTM. The whole framework is easily implemented and trained end to end.

We evaluate our model on the widely used Volleyball Dataset under two dataset settings, named fine-division and non-fine-division. Experimental results show that the proposed method achieves 83.5% accuracy in the fine-division setting and 77.6% accuracy in the non-fine-division setting. It also achieves 77.7% accuracy in the integrated experiment setting and 83.4% accuracy in the separation experiment setting. Examples of recognition results and of relationships within the group are visualized.

This study proposes a novel neural network for group activity recognition and constructs a unified framework based on the NCNN, the Attentive Pooling scheme, and a hierarchical LSTM network. Motivated by the relationships between individuals and their unequal contributions to the group activity, we model both with a nonlocal network and exploit the contextual information within the group. When extracting individual features, the method learns more discriminative features that incorporate the influence of every individual. The experimental results confirm the effectiveness of our nonlocal model and attentive pooling scheme, indicating that the contextual information between individuals and the hierarchical structure of the group both facilitate group activity recognition.
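
The two ideas described above, similarity-based nonlocal context between individuals and attention-weighted aggregation of individual features, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption for illustration: dot-product similarity, a residual connection in `nonlocal_refine`, a fixed `query` vector standing in for learned attention parameters, and the function names themselves. In a trained model these weights would be learned end to end rather than set by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, features):
    """Sum of feature vectors, each scaled by its weight."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def nonlocal_refine(features):
    """Hypothetical nonlocal step: each person's feature is augmented with a
    similarity-weighted average of all persons' features (residual form).
    Similarity is assumed to be a plain dot product."""
    refined = []
    for f_i in features:
        weights = softmax([dot(f_i, f_j) for f_j in features])
        context = weighted_sum(weights, features)
        refined.append([a + c for a, c in zip(f_i, context)])
    return refined


def attentive_pool(features, query):
    """Hypothetical attentive pooling: score each person against a query
    vector, normalize the scores with softmax, and return the weighted sum
    of person features together with the attention weights."""
    weights = softmax([dot(query, f) for f in features])
    return weighted_sum(weights, features), weights


# Three people with 2-dimensional features; values are made up.
people = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # stand-in for learned attention parameters
group_feature, attention = attentive_pool(nonlocal_refine(people), query)
```

Unlike max or average pooling, the attention weights let individuals contribute unequally to the group feature, and inspecting `attention` shows which individuals the pooled representation relies on most.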
Keywords/Search Tags:Group Activity Recognition, Nonlocal Network, Attention Mechanism, Feature Representation, Deep Learning