With advances in science and technology, surveillance cameras have become widely used in public places. As the number of surveillance devices grows, it becomes difficult for people to detect abnormal behaviors in surveillance videos in real time. Abnormal behavior detection in videos has therefore become an active research topic. To address open issues in this area, two methods based on weakly supervised learning are proposed: an abnormal behavior detection method based on an improved multi-instance ranking framework, and an abnormal behavior detection method based on a multi-instance channel attention framework. The main research work is as follows.

(1) To address the problem that the loss functions of previous methods do not consider the continuity of abnormal clips within a video, an abnormal behavior detection method based on an improved multi-instance ranking framework is proposed. Taking video frames and optical flow as input, the feature vectors of video clips are extracted and then fused to form the input of the network model. To alleviate overfitting, ReLU functions are used after the first and second fully connected layers of the network. To enforce sparsity and smoothness constraints, an improved multi-instance ranking loss function is designed. During detection, the instances with the highest abnormality scores in the positive bags and the negative bags are computed respectively. In a positive bag, the video clip with the highest score may contain abnormal behavior; in a negative bag, the video clip with the highest score contains only normal behavior. Experimental results on the ShanghaiTech dataset show that the proposed method achieves better results than the baseline approaches.

(2) To address the issue of feature rectification, an abnormal behavior detection method based on a multi-instance channel attention framework is proposed, which mainly comprises a multi-instance pseudo-label generation network and a channel self-attention network. To rectify the extracted feature vectors, the two-dimensional channel attention module is extended to three-dimensional space. After the feature vectors of the fifth layer of the I3D module in this framework are computed, they are fed into the three-dimensional channel attention module, which suppresses uninformative features. Then, based on the attention mechanism, the output feature vectors are fused with the feature vectors of the fourth layer of the I3D module. After the multi-instance pseudo-label generation network has computed pseudo labels, the feature encoder is trained in a supervised manner. In the testing stage, the feature encoder extracts feature vectors, and a multi-layer perceptron computes the abnormality scores of video clips. On the ShanghaiTech dataset, the proposed method obtains better experimental results than the baseline methods.
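
The improved ranking loss of method (1) is not spelled out in this abstract. As a rough illustration only, the sketch below shows the commonly used form of a multi-instance ranking loss with smoothness and sparsity terms, which is the kind of objective such an improvement would start from; it is not the proposed loss. The function name, the margin, and the weights `lambda_smooth` and `lambda_sparse` are assumptions chosen for illustration.

```python
def mil_ranking_loss(pos_scores, neg_scores,
                     lambda_smooth=8e-5, lambda_sparse=8e-5, margin=1.0):
    """Illustrative ranking loss for one positive bag and one negative bag.

    pos_scores, neg_scores: per-clip abnormality scores in [0, 1],
    listed in temporal order. All default weights are assumed values.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both bags must contain at least one clip score")

    # Hinge term: the top-scoring clip of the positive bag should be
    # ranked above the top-scoring clip of the negative bag by `margin`.
    ranking = max(0.0, margin - max(pos_scores) + max(neg_scores))

    # Smoothness term: temporally adjacent clips of the positive bag
    # should receive similar scores.
    smooth = sum((a - b) ** 2 for a, b in zip(pos_scores, pos_scores[1:]))

    # Sparsity term: only a few clips of the positive bag should score high.
    sparse = sum(pos_scores)

    return ranking + lambda_smooth * smooth + lambda_sparse * sparse


# Example: a positive bag with a short abnormal segment, and a negative bag.
positive_bag = [0.1, 0.9, 0.8, 0.1]
negative_bag = [0.2, 0.1, 0.3, 0.2]
loss = mil_ranking_loss(positive_bag, negative_bag)
```

Only the maximum score of each bag enters the hinge term, which is why video-level labels suffice for training; the two regularizers act on the positive bag alone.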