
Research On Multi-source Face Tracking And Aggregation Risk Warning Based On Deep Learning

Posted on: 2024-08-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Y Ren
Full Text: PDF
GTID: 1528307301973949
Subject: Mechanical engineering
Abstract/Summary:
A single camera capturing face images often acquires incomplete target information over a narrow range because of poor viewing angles, target occlusion, and intentional avoidance by the target. To address these challenges, this paper proposes a multi-camera, multi-view joint tracking method that aims to solve the problems of incomplete face information acquisition and target loss caused by occlusion and boundary crossing. Through multi-camera collaboration, the tracking algorithm can follow the target trajectory more stably and comprehensively, and it supports target face matching, action linkage, and crowd data screening. Wide-area, large-scope video surveillance composed of multiple cameras is gradually replacing traditional single-camera, small-scene surveillance. Although research in this area still faces multiple challenges, this paper proposes a solution for multi-source face tracking and abnormal crowd gathering warning.

(1) Because the effectiveness and range of single-camera video surveillance are limited, joint recognition across multiple intelligent surveillance cameras has become a research hotspot. To solve the problem of multi-camera face tracking under different camera fields of view, this study is dedicated to accurately tracking the trajectory of a target person across wide-area video. Deep-learning-based image retrieval is used to find, in massive multi-source image data, the face most similar to that of the target person. This technique improves retrieval efficiency on massive data and quickly establishes the identity of the tracked target. Experimental analysis shows that the four-branch Siamese (twin) network model in this paper has higher accuracy and better robustness. By sharing face features between cameras in real time, this paper realizes cross-camera face tracking, and it tracks the faces of pedestrians passing through the area by comparing the similarity of face features within the same video surveillance field of view. The retrieval model significantly improves the accuracy of face tracking: on the LFW dataset, face tracking accuracy reaches 99.51% and average face retrieval accuracy reaches 98.74%, at a frame rate of 75 FPS.

(2) In real production safety scenarios, correlations among a target's multiple features are often not modeled or are weak, and environment and illumination can prevent pedestrian behavior from being associated with face information. As a result, violations or dangerous actions may go unlinked to an identity, which may lead to loss of life and property. To solve this problem, this paper proposes a method based on real-time 3D human action recognition, which aims to recognize dangerous actions of targets across multi-source cameras and to link those actions to identity and tracking. Experimental results show that 3D2 Streams, a real-time detection network that lifts 2D skeletons to 3D, can complete the 3D estimation and transformation of 2D skeleton key points while fusing 2D and 3D skeleton features for the task of 3D human pose recognition. On the Human3.6M dataset and on the NTU RGB+D 60 dataset with multi-view augmentation based on Eulerian transformation, the 3D skeleton action recognition in this paper achieves 88.2% cross-subject accuracy and 95.6% cross-view accuracy. This method significantly improves the prediction accuracy of 3D skeleton action recognition in real-time surveillance.

(3) To address severe target occlusion and the potential danger of stampede accidents caused by overcrowding, this paper proposes a real-time crowding level warning and key small-target face detection method for dense crowds. When the number of people reaches the safety limit, the method provides timely crowding level warnings and area evacuation hints to prevent potentially dangerous situations. Importantly, the method can also send warning signals for emergency escape or gathering risk when key targets within the crowd are identified. In the experiments, the DRCConv LSTM network is applied to dense crowd videos. The network combines a depth-aware model and a depth-adaptive Gaussian kernel to extract spatio-temporal features and depth-level matching information under crowd depth-space edge constraints, which significantly improves the effectiveness of crowding detection and counting in dense crowds. Crowd counting is evaluated on five datasets (MICC, Crowd Flow, FDST, Mall, and UCSD), where the method achieves better counting performance, verifying its effectiveness. Further ablation experiments confirm the contribution of each component of the model.

Finally, this paper integrates face tracking, real-time 3D skeleton action recognition, and dense crowd counting into a surveillance system for industrial scenes. Using machine vision, the system recognizes the identity of pedestrians intruding into dangerous areas, detects their abnormal actions, and sends out alarm signals. The system can also detect abnormal crowd gathering caused by emergencies in other industrial scenes and issue crowd congestion warnings in time to prevent potentially dangerous events. The system adopts a variety of deep learning models and, through training on samples from scenarios across multiple datasets, significantly improves the precision of intelligent analysis of various types of violations in industrial scenarios and the accuracy of the alarm prompts.
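
To illustrate the cross-camera matching idea in part (1), the following is a minimal sketch of comparing a face feature vector from one camera against features already shared by other cameras. It is not the dissertation's four-branch network: the embeddings are tiny made-up vectors, and the names `cosine_similarity`, `match_identity`, and the `threshold` value of 0.6 are assumptions chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no identity information
    return dot / (norm_a * norm_b)


def match_identity(query, gallery, threshold=0.6):
    """Return (identity, score) of the most similar gallery entry.

    `gallery` maps an identity label to a feature vector shared by another
    camera. The threshold is a hypothetical value; if the best score falls
    below it, the face is treated as unknown and the identity is None.
    """
    best_id, best_score = None, -1.0
    for identity, embedding in gallery.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_id, best_score = identity, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Toy 3-dimensional features; real face embeddings are much longer.
gallery = {
    "person_a": [0.9, 0.1, 0.0],
    "person_b": [0.0, 0.2, 0.9],
}
query_from_camera_2 = [0.8, 0.2, 0.1]
matched_id, score = match_identity(query_from_camera_2, gallery)
unknown_id, unknown_score = match_identity([0.0, 1.0, 0.0], gallery)
```

In this sketch the query is assigned to the gallery identity with the highest similarity, and a face that resembles no shared feature strongly enough is left unmatched rather than forced onto the nearest identity.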
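
The counting and warning logic in part (3) can be pictured with a small density-map sketch. This is not the DRCConv LSTM network; it only shows the common idea that each annotated head contributes a normalized Gaussian whose width varies per head (here supplied directly as `sigma`, standing in for a depth-adaptive kernel), that the crowd count is the sum of the density map, and that a warning is raised as the count approaches a safety limit. The function names and the 0.8 warning ratio are assumptions for illustration.

```python
import math


def gaussian_density_map(heads, height, width):
    """Build a density map from head annotations.

    `heads` is a list of (row, col, sigma) tuples with the head position
    inside the grid and sigma > 0. A smaller sigma is assumed for people
    farther from the camera. Each kernel is normalized to sum to 1.
    """
    density = [[0.0] * width for _ in range(height)]
    for row, col, sigma in heads:
        kernel = [
            [
                math.exp(-((r - row) ** 2 + (c - col) ** 2) / (2.0 * sigma ** 2))
                for c in range(width)
            ]
            for r in range(height)
        ]
        total = sum(map(sum, kernel))
        for r in range(height):
            for c in range(width):
                density[r][c] += kernel[r][c] / total
    return density


def crowd_count(density):
    """The estimated number of people is the integral of the density map."""
    return sum(map(sum, density))


def congestion_level(count, safety_limit, warn_ratio=0.8):
    """Map a count to a warning level; warn_ratio is a hypothetical setting."""
    ratio = count / safety_limit
    if ratio >= 1.0:
        return "evacuate"
    if ratio >= warn_ratio:
        return "warning"
    return "normal"


heads = [(2, 3, 1.0), (10, 12, 2.5), (15, 5, 0.8)]
density = gaussian_density_map(heads, height=20, width=20)
count = crowd_count(density)
level = congestion_level(count, safety_limit=3.5)
```

Normalizing each kernel keeps the total count equal to the number of annotated heads regardless of kernel width, so changing `sigma` only changes how a person's mass is spread over the image.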
Keywords/Search Tags:Multi-source face tracking, Multi-source face retrieval, 3D skeleton action recognition, Dense crowd count, Congestion warning