| Video-based human behavior recognition is a research hotspot in computer vision and is widely used in fields such as smart homes, health monitoring, and public security. However, different scenarios call for different behavior recognition algorithms. This paper therefore improves the spatio-temporal local interest point algorithm and the long-term recurrent convolutional network algorithm to suit the different characteristics of video data. The specific research contents are as follows: (1) For simple, small-sample data, the spatio-temporal interest point algorithm offers a high recognition rate and simple computation. However, interest point features contain only local information and easily lose neighboring information, so we propose a human behavior recognition method that integrates global and local features. First, the moving target in the video is detected and the edge orientation histogram (EOH) is extracted as the global feature. Then, a multi-scale histogram of optical flow (MHOF) is constructed as the local feature from the interest points of the moving target. Finally, a feature-level concatenation strategy fuses the two features, and an SVM performs behavior classification. The algorithm is implemented with OpenCV and MATLAB, and experiments are conducted on the Weizmann and KTH behavior datasets. The experimental results show that the algorithm improves the accuracy of human behavior recognition and that the fused feature has better robustness and discriminative ability. (2) For complex, large-sample data, the long-term recurrent convolutional network algorithm extracts spatial features with a CNN and learns the temporal information of behavior with an LSTM. To further exploit the long-term dependencies of video sequences, a human behavior recognition model based on a parallel cross convolution network (PCCNN) and a bi-directional long short-term memory network (Bi-LSTM) is
proposed. First, the PCCNN extracts two sets of convolutional features and crosses them in the fully connected layer, which enhances feature robustness. Then, the crossed features are passed sequentially to the two layers of the Bi-LSTM, which capture the bi-directional temporal information of the video sequences. Finally, the features from the two directions jointly predict the behavior class in the softmax layer. The algorithm is implemented in Keras with TensorFlow as the backend, and experiments are carried out on the large-scale UCF101 and HMDB51 datasets. The experimental results show that the deep combined model improves the discriminative ability of the network and that the two-stream network structure performs better in behavior recognition. |
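
The feature-level fusion described in part (1) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' OpenCV/MATLAB implementation. The function names (`edge_orientation_histogram`, `multiscale_flow_histogram`, `fuse_features`), the bin count, the two-band magnitude split, and the `threshold` value are all assumptions chosen for illustration. The sketch builds a magnitude-weighted edge orientation histogram as the global feature, a simplified two-scale optical flow histogram as the local feature, and concatenates the two into one vector.

```python
import math


def l1_normalize(values):
    """Scale a histogram so its entries sum to 1 (left unchanged if all zero)."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)


def edge_orientation_histogram(image, bins=8):
    """Global feature: magnitude-weighted histogram of unsigned edge orientations.

    `image` is a 2D list of grayscale values; gradients use central differences.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat region, no edge
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += mag
    return l1_normalize(hist)


def multiscale_flow_histogram(flows, bins=8, threshold=1.0):
    """Local feature: direction histogram of flow vectors, split into two
    magnitude bands (an assumed, simplified form of a multi-scale histogram).

    `flows` is a list of (dx, dy) flow vectors sampled at interest points.
    """
    hist = [0.0] * (2 * bins)
    for dx, dy in flows:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # no motion at this point
        angle = math.atan2(dy, dx) % (2 * math.pi)  # direction in [0, 2*pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        band = 0 if mag < threshold else 1  # slow vs. fast motion
        hist[band * bins + idx] += 1.0
    return l1_normalize(hist)


def fuse_features(global_feat, local_feat):
    """Feature-level fusion by simple concatenation."""
    return list(global_feat) + list(local_feat)


# Toy inputs: a 5x5 frame with a vertical step edge and three flow vectors.
frame = [[0, 0, 1, 1, 1] for _ in range(5)]
flows = [(0.5, 0.0), (2.0, 0.0), (0.0, 3.0)]

eoh = edge_orientation_histogram(frame)
mhof = multiscale_flow_histogram(flows)
fused = fuse_features(eoh, mhof)
print(len(eoh), len(mhof), len(fused))
```

In a full pipeline, one fused vector per video (or per frame window) would be passed to an SVM classifier; that step is omitted here to keep the sketch free of external libraries.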