The Tibetan Plateau is one of China's best-known tourist destinations and attracts a large number of visitors every year. As visitor numbers grow, ensuring people's safety in scenic areas becomes particularly important. Manual supervision and review of surveillance video can reduce safety incidents, but it incurs high labor costs. Using computer vision to analyze surveillance video intelligently has therefore become key to solving this problem. This thesis addresses abnormal behavior detection in surveillance videos of high-altitude tourism scenic areas. Based on a purpose-built dataset for high-altitude tourism scenic areas, different video abnormal behavior detection algorithms are designed according to the dataset's characteristics, improving the efficiency and accuracy of abnormal behavior detection in surveillance video and helping to ensure visitor safety within scenic areas. The main research contents are as follows:

1. In response to the lack of public datasets for high-altitude tourism scenic areas, a dataset was created through shooting, screening, frame extraction, and labeling. Compared with public anomaly detection datasets, this dataset has more complex backgrounds, denser foreground targets, and less motion information between video frames, all of which make abnormal behavior detection more challenging.

2. To address abnormal behavior detection in surveillance videos of high-altitude tourism scenic areas, an autoencoder-based backbone network with feature storage is proposed. The backbone restrains the generalization ability of the convolutional neural network so that anomalous samples cannot be reconstructed well, which makes them easier to detect. Two algorithms were designed on this backbone. The first is based on video frame reconstruction and uses intensity loss and gradient loss as constraints during training. The second is based on video frame prediction,
adopting a model structure similar to that of generative adversarial networks, in which the autoencoder serves as the generator and the discriminator follows the Markovian discriminator approach. In addition to the intensity and gradient losses, both algorithms use a motion loss so that motion information between frames is exploited effectively during the inference phase. Both algorithms determine abnormal behavior from the reconstruction error between input and output. Both perform well on public datasets, but their performance on the high-altitude tourism scenic area dataset was not ideal; on that dataset the frame prediction algorithm achieved 4.8% higher accuracy than the frame reconstruction algorithm.

3. Testing and analysis on the high-altitude tourism scenic area dataset showed that, because of its complex backgrounds, dense foreground targets, and insufficient motion information between video frames, relying solely on appearance-based errors as the discrimination criterion renders anomaly detection algorithms ineffective. To address this, a video abnormal behavior detection algorithm based on mixed multi-input feature clustering was designed, which preprocesses the data with an object detection network. The algorithm reduces the impact of complex backgrounds and dense foreground targets, and it enriches the motion information between video frames by using an RGB_diff representation of foreground targets' movement across frame intervals. Compared with the frame reconstruction and frame prediction algorithms, it improves accuracy by 22.9% and 18.1%, respectively. These results validate the algorithm's effectiveness in detecting abnormal behaviors in surveillance videos of high-altitude tourism scenic areas.
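
The abstract names an intensity loss, a gradient loss, an RGB_diff motion representation, and an error-based anomaly decision, but it does not give their formulas. The following is a minimal NumPy sketch of one common way to write these quantities. It assumes frames are float arrays with values in [0, 1], and all function names (`intensity_loss`, `gradient_loss`, `rgb_diff`, `anomaly_scores`) are hypothetical. The thesis may define these terms differently, and the autoencoder, discriminator, and clustering steps are not modeled here.

```python
import numpy as np


def intensity_loss(output, target):
    """Mean squared pixel difference between an output frame and its target."""
    return float(np.mean((output - target) ** 2))


def gradient_loss(output, target):
    """Mean absolute difference between the spatial gradients of two frames.

    Frames are assumed to have shape (H, W, C); gradients are taken
    along the width and height axes.
    """
    def spatial_gradients(frame):
        dx = np.abs(frame[:, 1:] - frame[:, :-1])   # horizontal neighbours
        dy = np.abs(frame[1:, :] - frame[:-1, :])   # vertical neighbours
        return dx, dy

    out_dx, out_dy = spatial_gradients(output)
    tgt_dx, tgt_dy = spatial_gradients(target)
    return float(np.mean(np.abs(out_dx - tgt_dx)) + np.mean(np.abs(out_dy - tgt_dy)))


def rgb_diff(frames):
    """Differences between consecutive frames: (T, H, W, C) -> (T-1, H, W, C)."""
    return frames[1:] - frames[:-1]


def anomaly_scores(errors):
    """Min-max normalise per-frame errors to [0, 1]; higher means more anomalous."""
    errors = np.asarray(errors, dtype=np.float64)
    span = errors.max() - errors.min()
    if span == 0:
        return np.zeros_like(errors)
    return (errors - errors.min()) / span


# Toy data: four random frames; the "model output" for frame 2 is deliberately
# corrupted so that its reconstruction error stands out.
rng = np.random.default_rng(0)
frames = rng.random((4, 8, 8, 3))
outputs = frames.copy()
outputs[2] += 0.5

errors = [intensity_loss(o, f) for o, f in zip(outputs, frames)]
scores = anomaly_scores(errors)
motion = rgb_diff(frames)
```

In this toy run only the corrupted frame receives a high score. In practice a threshold on the normalized score would have to be chosen per dataset, which the abstract does not specify.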