Research On Human Posture Recognition Based On 3D Images

Posted on: 2022-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: H C Du
GTID: 2518306338967849
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the increasing popularity of RGB-D 3D video capture devices, computer vision tasks for RGB-D video have received growing research attention. Compared with traditional RGB video, the added depth information can effectively recover the depth structure of the scene and thus improve the accuracy of subsequent recognition tasks. In this context, this thesis focuses on human pose recognition in RGB-D video: it uses deep neural networks to recognize 3D joints, uses the recognized joint data for action classification, and verifies the feasibility of the proposed algorithms in an in-car scene to connect them with practical requirements. The main research contents and contributions are as follows.
First, this thesis constructs an in-vehicle RGB-D pose dataset, collected with RGB-D cameras and wearable inertial sensors. The dataset contains RGB video, depth images, and 3D joint information for the driver. A spatial-temporal synchronization method is proposed that aligns the multiple data streams in time and space using specific signal alignment and camera projection regression. Large-scale annotation is automated by fitting to the annotations of specific manually labeled frames, which reduces manual workload and provides reliable in-vehicle scene data for subsequent pose estimation and action recognition.
Second, this thesis designs and implements a human pose estimation model based on RGB-D data. The algorithm is divided into relative pose estimation and absolute pose estimation. Compared with relative pose estimation algorithms that regress the 3D pose from the 2D pose alone, this thesis additionally uses depth images to compensate for the lack of depth in RGB images. For relative pose estimation, the output of a typical 2D pose estimation algorithm and the depth image serve as input; depth features are extracted with a spatial attention mechanism, and the depth features and 2D pose are then used to regress the relative 3D pose of the human body. For absolute pose estimation, features extracted from the depth image are used to detect the position of the body's root joint, and the absolute pose is obtained by combining this position with the relative pose estimate. The algorithm is tested on the Human3.6M dataset and compared with a baseline that uses only the 2D pose. The experimental results show that the proposed algorithm effectively reduces both the average joint position error and the root joint localization error.
Third, this thesis proposes a multi-scale co-occurrence feature action recognition algorithm based on 3D joint sequences. Many current graph convolution models for action recognition focus on learning spatial features. The proposed model introduces multi-scale temporal features and uses temporal convolution kernels of several sizes to strengthen its ability to learn temporal information, so that it recognizes actions of different durations more reliably. The model uses a graph convolution module to learn the topological features of the human body, and uses the properties of convolution together with an attention mechanism to learn correlations between joints that are not directly connected, so that spatial learning is not limited to the structure of the human body. The experimental results show that the model outperforms the baseline on both the NTU-RGB-D and Kinetics datasets and can effectively identify abnormal driving behaviors in the in-vehicle dataset.
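The multi-scale temporal convolution idea in the third contribution can be illustrated with a minimal Python sketch. It is not the thesis model: the kernel sizes (3, 5, 7), the uniform placeholder weights, and the function names `temporal_conv` and `multi_scale_temporal_features` are assumptions chosen for illustration, and the graph convolution and attention components are omitted.

```python
from typing import List, Sequence


def temporal_conv(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide a kernel along the time axis with zero padding ("same" length output).

    As in most deep learning libraries, the kernel is not flipped.
    """
    k = len(kernel)
    assert k % 2 == 1, "odd kernel sizes keep each output centered on its frame"
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[t + i] * kernel[i] for i in range(k))
        for t in range(len(signal))
    ]


def multi_scale_temporal_features(
    signal: Sequence[float], kernel_sizes: Sequence[int] = (3, 5, 7)
) -> List[List[float]]:
    """Run one branch per kernel size and stack the branch outputs per frame."""
    branches = []
    for k in kernel_sizes:
        # Hypothetical placeholder weights; a trained model would learn these.
        kernel = [1.0 / k] * k
        branches.append(temporal_conv(signal, kernel))
    # features[t] holds one value per temporal scale for frame t.
    return [list(frame) for frame in zip(*branches)]


# One coordinate of one joint over 8 frames, with a brief spike at frame 3.
trajectory = [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0]
features = multi_scale_temporal_features(trajectory)
```

In this toy input, the spike at frame 3 already contributes to frame 0 in the size-7 branch but not in the size-3 or size-5 branches, which is the sense in which wider kernels cover longer-lasting motion.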
Keywords/Search Tags: spatial-temporal synchronization, attention mechanisms, graph convolution, multi-scale convolution