With the rapid development of the economy, technology, and medical care, population aging is becoming increasingly severe, and the health problems of the elderly are growing more prominent. Falls have become one of the main causes of accidental injury and death among the elderly. Detecting falls promptly can buy valuable time for rescue. Based on computer vision, this thesis addresses fall detection from two aspects, human pose modeling and action sequence modeling, fully exploiting human pose information and action sequence features to improve detection accuracy. First, in terms of pose modeling, existing fall detection methods generally use human skeletons to represent pose. Most of these methods rely on feature engineering to extract pose features from prior knowledge, ignoring the interaction between key points in the skeleton. In this thesis, the human skeleton is represented as an undirected graph, and a graph convolutional pose classification model is designed to classify different poses. Graph convolution captures the relationships between key points in the skeleton so that human pose information is fully extracted. Second, existing pose estimation methods suffer from missing key points, which leaves the extracted human pose incomplete and degrades detection accuracy. Based on a visual analysis of the pose estimation model PIFPAF, this thesis designs a pose localization network and a pose classification network that use the output features of the backbone model to extract a more complete human pose, alleviating the incompleteness caused by missing key points. The proposed pose localization network localizes and extracts human pose features in a single forward pass, making efficient use of the network parameters. Finally, in the task of fall detection, most methods employ
recurrent neural networks such as LSTM and GRU to model temporal information. However, recurrent neural networks cannot be computed in parallel, which leads to high computational cost, and they have limited ability to handle long-range dependencies. To enhance the model's understanding of how an action unfolds over time, this thesis introduces the attention mechanism into the sequence modeling of pose features, learning the correlation between features at different time steps through adaptively computed similarity and making full use of the action sequence information. To address the lack of fall pose data, this thesis constructs the VLG-POSE pose classification dataset to train and evaluate pose modeling methods. The proposed methods are experimentally validated on the pose classification dataset VLG-POSE, the object detection datasets COCO and PASCAL VOC, and the fall datasets Le2i-FDD and URFD. The experimental results show that the proposed graph convolutional network and pose localization network fully extract human pose information and improve the performance of the fall detection system, outperforming traditional spatial pose modeling methods. The proposed attention-based temporal modeling method effectively utilizes the temporal information of falling actions and performs better than methods based on recurrent neural networks. The experiments verify the effectiveness of the proposed spatial pose modeling and temporal modeling methods for fall detection.
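
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: one graph-convolution layer over a skeleton treated as an undirected graph, and scaled dot-product self-attention over a sequence of per-frame pose features. The five-joint skeleton, the edge list, the weight matrix, and all function names are illustrative assumptions, not the thesis's actual architecture. The attention here uses identity query/key/value projections for brevity.

```python
import math

# Hypothetical 5-joint skeleton (head, neck, hip, left foot, right foot).
# The real key-point layout used in the thesis is not stated in the abstract.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
NUM_JOINTS = 5


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(edges, n):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph with self-loops."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0  # undirected: fill both directions
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def graph_conv(x, a_hat, w):
    """One graph-convolution layer: ReLU(A_hat @ X @ W).

    Each joint's output mixes its own features with those of its neighbours.
    """
    h = matmul(matmul(a_hat, x), w)
    return [[max(0.0, v) for v in row] for row in h]


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def self_attention(seq):
    """Scaled dot-product self-attention over a sequence of feature vectors.

    Assumption: identity Q/K/V projections, so similarity is a plain dot product.
    Returns the attended sequence and the attention weights.
    """
    d = len(seq[0])
    scores = matmul(seq, [list(c) for c in zip(*seq)])  # seq @ seq^T
    weights = [softmax([v / math.sqrt(d) for v in row]) for row in scores]
    return matmul(weights, seq), weights


def pose_feature(joints, a_hat, w):
    """Pool joint features into one vector per frame (mean over joints)."""
    h = graph_conv(joints, a_hat, w)
    return [sum(col) / len(h) for col in zip(*h)]


if __name__ == "__main__":
    a_hat = normalized_adjacency(EDGES, NUM_JOINTS)
    w = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]  # illustrative, untrained weights
    standing = [[0.0, 1.0], [0.0, 0.8], [0.0, 0.5], [-0.1, 0.0], [0.1, 0.0]]
    lying = [[0.9, 0.1], [0.7, 0.1], [0.4, 0.1], [0.0, 0.0], [0.0, 0.2]]
    frames = [standing, standing, lying]  # toy "fall" sequence
    features = [pose_feature(f, a_hat, w) for f in frames]
    attended, weights = self_attention(features)
    print(weights)
```

Unlike a recurrent model, the attention weights for all time steps come from a single matrix product, so every frame can attend to every other frame without sequential state updates.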