
Behavior Understanding Based On Human-Object Interaction Detection

Posted on: 2021-03-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Wang
Full Text: PDF
GTID: 2428330605969600
Subject: Control Science and Engineering

Abstract/Summary:
Human behavior understanding is one of the main functions of an intelligent space, and it is also the premise and foundation for autonomous robot service. At present, most research on human behavior understanding is based on detecting human-object interaction. In a simple laboratory environment, human-object interaction can be detected by combining object detection with action recognition. However, general object detection and action recognition methods cannot be applied to detecting human-object interaction in complex home environments. To solve this problem, this paper studies human-object interaction detection in complex environments and proposes an interactive object detection method and a multi-category action recognition method suited to this task. Finally, we realize human behavior understanding in complex environments based on human-object interaction detection.

Considering that the human is the core of human-object interaction, we propose a pose estimation model with diverse receptive fields to predict the coordinates of the human body joints in an image. To handle the varying scales of human joints in images, this paper designs a multi-scale receptive field feature extraction module and applies it to the input layer and the bridge connection layer of the model to extract multi-scale features from the image. To handle joint occlusion, this paper designs a joint vector map and a contextual feature extraction module, and optimizes the prediction of occluded joints by using the structural relationships between human joints. We conduct experiments on two pose estimation datasets, LSP and MPII. The recognition accuracies on the two datasets are 86.2% and 83.9%. These results show that our model can effectively predict the coordinates of human joints.

This paper proposes an interactive object detection method based on the object detection model YOLOv3. We propose a heatmap of interactive objects, which can be used to further identify, from the object detection results, the objects that people are interacting with and the actions they belong to. To predict this heatmap, we modify the structure of YOLOv3 and use a human-joint guidance mechanism during network training: the network first predicts the human joints and then gradually predicts the heatmap of the interactive objects. Finally, after the interactive objects of each action are determined from the heatmap, the semantic information of each interactive object (tool or target object) is determined by Bayesian estimation. We test the method on the V-COCO dataset. The results show that this method detects interactive objects better than existing methods.

We also propose a multi-class action recognition method for human-object interaction detection. First, from the joint coordinates, we use anthropometric knowledge to estimate the length of each limb of the human body, determine the standard skeleton model in the defined coordinate system, and determine the affine transformation parameters as human action features. Then, we determine the interactive object features using the interactive object heatmap and the feature maps extracted by the convolutional neural network. Finally, the two sets of features are fused to realize multi-class action recognition.

To normalize and store the results of human behavior understanding, this paper investigates how to quantify those results and proposes a ternary description vector to express them. This paper fuses the results of each stage to generate a ternary description vector for the human in the image, and designs a storage and interaction program for users. Users can consult the behavior understanding results and send them to other tasks through the interaction program.
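
As a rough illustration of how an interactive-object heatmap could be used to pick the interacting object out of ordinary detection results, the sketch below scores each detected box by the mean heatmap value inside it. The abstract does not give the actual selection rule, so everything here is an assumption made for illustration: the `Detection` class, the `mean_heat` and `pick_interactive_object` helpers, the mean-activation scoring, and the 0.5 threshold are all hypothetical, and the toy heatmap stands in for a single action's predicted map.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Detection:
    """One box from a generic object detector (hypothetical container).

    Coordinates are pixel indices; x2 and y2 are exclusive.
    """
    label: str
    x1: int
    y1: int
    x2: int
    y2: int


def mean_heat(heatmap: Sequence[Sequence[float]], det: Detection) -> float:
    """Average heatmap value inside the box (0.0 for an empty box)."""
    values = [
        heatmap[y][x]
        for y in range(det.y1, det.y2)
        for x in range(det.x1, det.x2)
    ]
    return sum(values) / len(values) if values else 0.0


def pick_interactive_object(
    heatmap: Sequence[Sequence[float]],
    detections: List[Detection],
    threshold: float = 0.5,  # assumed cut-off, not taken from the thesis
) -> Optional[Detection]:
    """Return the box with the highest mean heat, or None if it is below threshold."""
    if not detections:
        return None
    best = max(detections, key=lambda d: mean_heat(heatmap, d))
    return best if mean_heat(heatmap, best) >= threshold else None


# Toy 4x6 heatmap for one action: activation is concentrated on the right.
heatmap = [
    [0.0, 0.0, 0.1, 0.8, 0.9, 0.8],
    [0.0, 0.1, 0.1, 0.9, 1.0, 0.9],
    [0.0, 0.0, 0.1, 0.8, 0.9, 0.8],
    [0.0, 0.0, 0.0, 0.1, 0.1, 0.1],
]
detections = [
    Detection("chair", 0, 0, 3, 4),  # lies in the cold region
    Detection("cup", 3, 0, 6, 3),    # overlaps the hot region
]

chosen = pick_interactive_object(heatmap, detections)
print(chosen.label if chosen else "no interactive object")
```

In this toy setup the cup box covers the hot region and is selected, while a detection set containing only the chair yields no interactive object. A real pipeline would use one heatmap per action and would still need the later step, described in the abstract, of deciding whether the object is a tool or a target.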
Keywords/Search Tags: human behavior understanding, human-object interaction detection, pose estimation