Falling is a frequent and very dangerous event among the elderly. Elderly people who live without care may miss timely treatment and endanger their lives if a fall is not discovered promptly. How to ensure that a fall is detected in time has therefore become a research hotspot. At present, the most widely studied detection approaches include: having the elderly wear sensors that collect motion information to detect falls; deploying environmental sensors indoors that collect environmental data to detect falls; and using cameras to capture images of the elderly and performing fall detection after image processing. Because wearable sensors cause many inconveniences in the daily life of the elderly and environmental sensors impose requirements on the indoor layout, this paper uses only camera sensors to collect behavioral data and perform fall detection. Traditional image processing methods perform poorly in the complex home environment, whereas deep-learning-based human pose estimation can not only accurately extract the foreground human target but also locate the key points of the human body, making it well suited to fall detection. The main work includes the following four points:

(1) Building a human behavior dataset. The dataset was collected by a number of volunteers simulating falls and normal life behaviors in a home environment, including daily actions such as walking, jumping, lying down, and sitting down, as well as abnormal behaviors such as falling; it lays the foundation for the follow-up research.

(2) Foreground extraction. Comparing traditional image processing methods with pose estimation methods shows that the former are easily affected by the environment, so the extracted feature data are not suitable for fall detection. The human pose estimation method HRNet yields clearer and more distinct skeleton information; therefore, human pose estimation is selected as the foreground target extraction method.

(3) A fall detection method based on Bi-LSTM (Bi-directional Long Short-Term Memory) is proposed. Many fall detection algorithms use only a single image to decide whether a fall is occurring, ignoring the correlation between consecutive frames in the time dimension, while the information contained in video is significantly richer and captures the continuity of human actions. To address this problem, a Bi-LSTM network is built to detect falls from the human key points in video clips (a minimal sketch of such a network follows this abstract). Accuracy rates of 98.5% and 97.1% are obtained on our dataset and the public URFD dataset, respectively.

(4) A fall detection method based on a spatial-temporal graph convolutional network (ST-GCN) is proposed. Although the Bi-LSTM network can extract contextual information in the time dimension, it does not make good use of the spatial structure of the human skeleton, so a graph convolutional network is introduced. ST-GCN has achieved good results in action recognition; for the task of fall detection, we improve ST-GCN by removing human key points that are irrelevant to falls and reducing the network structure (a sketch of the pruned skeleton graph also follows this abstract). Accuracy rates of 99.5% and 98.6% are obtained on our dataset and the public URFD dataset, respectively.
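
The following is a minimal sketch of a Bi-LSTM classifier over per-frame keypoint sequences, as described in point (3). It assumes PyTorch and 17 COCO-style keypoints per frame; the layer sizes, number of layers, and clip length are illustrative placeholders, not the settings used in the thesis.

```python
# Minimal sketch (PyTorch assumed) of a Bi-LSTM fall detector over keypoint sequences.
import torch
import torch.nn as nn

class BiLSTMFallDetector(nn.Module):
    def __init__(self, num_keypoints=17, hidden_size=128, num_classes=2):
        super().__init__()
        # Each frame is flattened to the (x, y) coordinates of all keypoints.
        self.lstm = nn.LSTM(input_size=num_keypoints * 2,
                            hidden_size=hidden_size,
                            num_layers=2,
                            batch_first=True,
                            bidirectional=True)
        # Bidirectional LSTM doubles the feature size fed to the classifier.
        self.fc = nn.Linear(hidden_size * 2, num_classes)

    def forward(self, x):
        # x: (batch, frames, num_keypoints * 2) keypoint sequence from the pose estimator.
        out, _ = self.lstm(x)
        # Classify from the last time step's forward/backward hidden states.
        return self.fc(out[:, -1, :])

# Example: one clip of 30 frames with 17 keypoints per frame.
model = BiLSTMFallDetector()
clip = torch.randn(1, 30, 17 * 2)
logits = model(clip)  # shape (1, 2): fall vs. normal activity
```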
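Below is a sketch of how irrelevant key points can be pruned from the skeleton graph before it is fed to a reduced ST-GCN, as mentioned in point (4). The kept-joint list and edge set are illustrative assumptions (e.g., dropping fine-grained face keypoints such as eyes and ears), not the exact subset chosen in the thesis; only the construction of the normalized adjacency matrix follows the standard graph-convolution formulation.

```python
# Sketch of a pruned skeleton graph and its normalized adjacency for a reduced ST-GCN.
import numpy as np

# Hypothetical reduced joint set: torso and limb joints, face details removed.
kept_joints = ["nose", "l_shoulder", "r_shoulder", "l_elbow", "r_elbow",
               "l_wrist", "r_wrist", "l_hip", "r_hip",
               "l_knee", "r_knee", "l_ankle", "r_ankle"]
idx = {name: i for i, name in enumerate(kept_joints)}

# Undirected skeleton edges over the reduced joint set.
edges = [("nose", "l_shoulder"), ("nose", "r_shoulder"),
         ("l_shoulder", "l_elbow"), ("l_elbow", "l_wrist"),
         ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist"),
         ("l_shoulder", "l_hip"), ("r_shoulder", "r_hip"),
         ("l_hip", "r_hip"),
         ("l_hip", "l_knee"), ("l_knee", "l_ankle"),
         ("r_hip", "r_knee"), ("r_knee", "r_ankle")]

# Adjacency with self-loops, then symmetric normalization D^-1/2 (A + I) D^-1/2,
# as used by the spatial graph convolution layers.
A = np.eye(len(kept_joints))
for u, v in edges:
    A[idx[u], idx[v]] = A[idx[v], idx[u]] = 1.0
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_norm = D_inv_sqrt @ A @ D_inv_sqrt  # passed to each spatial graph convolution
```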