
Human Interaction Recognition Based On The Fusion Of RGB And Skeleton Data

Posted on: 2020-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: L L Qin
GTID: 2428330605980534
Subject: Engineering

Abstract/Summary:
Human interaction recognition from video is a hot topic in image processing and computer vision. Because RGB video lacks depth information, the accuracy and real-time performance of recognition based on RGB video alone do not meet the practical requirements of relevant industries. In recent years, Microsoft's Kinect device has made it possible to acquire skeleton data with depth information directly, providing an effective information supplement to RGB video. Meanwhile, deep learning networks can extract deep features from images directly, which greatly improves the accuracy of action recognition. This thesis therefore studies convolutional neural network (CNN) models based on the fusion of RGB and skeleton data.

First, skeleton data must be encoded into images before it can be fed to a CNN for recognition. Existing encoding schemes, however, do not adequately capture the spatial relationships between joint points or the interaction between two people. A joint-distance feature is therefore introduced to encode the skeleton data; the encoded images are then sent to a CNN to learn deep features and recognize two-person interactions. The resulting algorithm is easy to operate and performs well in real time.

Second, existing joint-information encoding methods suffer serious loss of spatial and temporal information. This thesis therefore proposes a novel joint motion map that encodes both temporal and spatial information and, combined with a CNN, achieves good recognition results. The method requires no complex preprocessing of the joint points and supports real-time processing.

Finally, according to the respective characteristics of RGB video and joint-point data, a CNN model based on dual-stream fusion of RGB and joint-point information is proposed. The RGB stream uses keyframes to obtain spatio-temporal images, while the joint stream encodes joint points into motion images. The two kinds of feature maps are fed to CNNs to obtain recognition scores, and the final result is obtained by fusing the recognition scores of the two streams. The method is simple to implement and transfers well. A two-person interaction recognition framework based on multi-source information fusion is thus successfully established.
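The final fusion step described above can be sketched as late (score-level) fusion: each stream's CNN produces a per-class score vector, and the vectors are combined before taking the predicted class. The abstract does not specify the fusion rule or weights, so the weighted average and the `alpha` parameter below are assumptions for illustration only.

```python
import numpy as np

def fuse_scores(rgb_scores, skeleton_scores, alpha=0.5):
    """Score-level fusion of the RGB stream and the skeleton stream.

    alpha weights the RGB stream; (1 - alpha) weights the skeleton
    stream. Both inputs are per-class score (probability) vectors.
    Returns the fused score vector and the predicted class index.
    """
    rgb = np.asarray(rgb_scores, dtype=float)
    skel = np.asarray(skeleton_scores, dtype=float)
    fused = alpha * rgb + (1.0 - alpha) * skel
    return fused, int(np.argmax(fused))

# Hypothetical example with 3 interaction classes
rgb_scores = [0.6, 0.3, 0.1]       # from the RGB spatio-temporal stream
skeleton_scores = [0.2, 0.7, 0.1]  # from the joint-motion-map stream
fused, label = fuse_scores(rgb_scores, skeleton_scores, alpha=0.5)
# fused = [0.4, 0.5, 0.1] -> predicted class index 1
```

With equal weights, the skeleton stream's confidence in class 1 outweighs the RGB stream's preference for class 0; tuning `alpha` lets the framework favor whichever modality is more reliable for a given dataset.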
Keywords/Search Tags: Human Interaction Recognition, RGB Video, Skeleton Data, Deep Learning, Information Fusion