Research On Crowd Behavior Recognition Technology Based On Video

Posted on: 2022-05-11 | Degree: Master | Type: Thesis
Country: China | Candidate: X G Kang | Full Text: PDF
GTID: 2518306317477664 | Subject: Computer Science and Technology
Abstract/Summary:
Crowd videos of the same category may contain different scenes, different numbers of people, and different fields of view, which makes crowd video classification very challenging. Some rare crowd behaviors (such as stampedes and riots) can easily cause heavy loss of life and property. Existing crowd video data sets are poorly balanced: they generally include a large number of ordinary crowd videos and very few samples of rare behaviors. Crowd behavior recognition is a multi-label classification task characterized by complex scenes and imbalanced samples.

Feature representation plays an important role in classification tasks. At present, mainstream video classification features are obtained through two streams (a static stream and a dynamic stream). In practice, however, feature selection is closely tied to scene information. When the background is complex, it is difficult to extract the most descriptive information, especially when identifying categories that co-occur in a crowd video scene. The traditional way to obtain motion features relies on optical flow. When the motion trend in a video is not obvious or is disturbed by the background, optical flow cannot effectively describe the relevance of the motion information. To obtain a discriminative feature representation in crowded scenes, we combine motion trend features with dynamic evolution features instead of using the traditional method. Specifically, we inject the association information between categories into the 3D dynamic information to obtain the motion trend features, and then fuse them with the static appearance features as the input of an LSTM network. In our setting, crowd behavior is often the interaction of multiple behaviors, which may be accompanied by some unrelated behaviors. Both describing the semantic relationship between behaviors and capturing the evolution trend of related behaviors are very important for identifying crowd behaviors. Therefore, we also apply the LSTM network to crowd scene recognition to capture the context between video frames.

Although features based on category information are highly descriptive and help considerably in classifying crowd behaviors in videos, they still have shortcomings. For example, some rare categories have too few samples, so the data set is imbalanced. This study trains the model with an LSTM network, which has a memory function and requires a large number of samples to achieve good classification results. For small-sample categories, satisfactory classification results cannot be achieved during training even with a strong feature representation. For these small-sample categories, this research therefore proposes the idea of a subspace to address the poor classification results. The idea is inspired by attribute allocation. The difference is that we look for a category subspace for a given category, and this subspace must meet two conditions: first, the distribution of the category in the subspace is relatively simple; second, the subspace uses the association relationship between categories. The subspace makes the categories more distinguishable and at the same time addresses the problem that training on few samples does not converge easily. In short, to distinguish a main category, we aim to generate a suitable subspace in which it can be easily distinguished by using the correlation information between the main category and the other categories.

The subject of this thesis is multi-label crowd behavior recognition. For multi-label recognition, the sigmoid activation function is usually applied to the output of the last layer of the network to obtain the final classification result. During classification, each category is assigned a probability value between 0 and 1, and the categories are independent of each other. With the sigmoid activation function, categories with relatively sufficient samples achieve decent performance, while categories with fewer samples obtain worse results under the same conditions. To overcome this problem, we design a subspace classifier. The classifier optimizes the current category, which differs from globally optimized classifiers that use the relationship between categories. Specifically, on the one hand the classifier weakens the main category and reduces its dependence on the number of samples; on the other hand it strengthens the relationship between the main category and the other categories so that the main category can be classified indirectly. In short, the classifier designed for categories with fewer samples optimizes the current subspace by weakening the main category and weighting the association relationship between categories.

The main contributions of this study are threefold: (1) to address the poor classification caused by appearance noise and dynamic noise, we propose combining motion trend features with dynamic evolution features; (2) to handle imbalanced samples and multi-label tasks, we propose an associated-subspace classification method, which makes it easier to distinguish small-sample categories from other categories; (3) to address the classification of small-sample categories, we design a subspace classifier, which optimizes the current subspace by weakening the main category and weighting the correlation between categories. The experimental results show that the proposed algorithm reaches the level of mainstream methods on the largest crowd behavior recognition data set (the WWW database). This research has theoretical significance and broad application prospects for intelligent video surveillance technology.
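
The multi-label sigmoid output described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the label names, logit values, and the 0.5 decision threshold are assumptions chosen only for illustration. It shows that each category receives its own probability between 0 and 1, and that the probabilities are scored independently, so they need not sum to 1.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued logit to a probability in [0, 1], avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multilabel_predict(logits, labels, threshold=0.5):
    """Score every category independently and keep those at or above the threshold.

    The threshold of 0.5 is an assumption for this sketch, not a value from the thesis.
    """
    probs = {label: sigmoid(z) for label, z in zip(labels, logits)}
    predicted = [label for label, p in probs.items() if p >= threshold]
    return probs, predicted


# Hypothetical categories and last-layer outputs for one video clip.
labels = ["walk", "queue", "stampede"]
logits = [2.0, 0.3, -1.5]

probs, predicted = multilabel_predict(logits, labels)
for label in labels:
    print(f"{label}: {probs[label]:.3f}")
print("predicted labels:", predicted)
```

Because each score is computed on its own, a rare category such as the hypothetical "stampede" label gets no help from the other categories in this baseline, which is the limitation the subspace classifier is meant to address.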
Keywords/Search Tags: Multi-label, subspace, imbalanced samples, associative subspace