
Human Interaction Detection Based On Multimodal Fusion

Posted on: 2022-12-03
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2518306755995779
Subject: Computer technology
Abstract/Summary:
Human-Object Interaction (HOI) detection, as an advanced machine vision task, is crucial for machines to understand the world more deeply. HOI detection must not only locate and identify the people and objects in a scene but, more importantly, infer the interactions between them. Research on HOI detection is of great significance to fields such as security systems and video retrieval. The HOI detection method proposed in this thesis is based on the fusion of three modalities: images, 3D human body topology, and object semantic information. The main work of this thesis is as follows:

(1) For human behavior discrimination, supplementing human pose features has become an important means of improving HOI recognition performance, because it provides more detailed human information for inferring interaction relationships. However, 2D pose features still lack human surface information and cannot capture the topological features that highlight the connections between body parts. In this thesis, interaction behavior is treated as the combined effect of the intuitive three-dimensional spatial position of the human and the object and the implied change in topological structure, which has a certain integrity and flexible connectivity. Constructing continuous 3D human surface information and topological connection relationships is therefore important for improving HOI accuracy. This thesis proposes an HOI recognition method based on human 3D mesh topology enhancement. It takes the 3D mesh model of the human body as the input of a mesh neural network and extracts, from the bottom up, mesh edge features that emphasize the interaction relationship. These features characterize the invariance of the main topological features in the interaction relationship and improve interaction recognition performance.

(2) For human-object interaction discrimination, the branch that models the relative spatial layout of the human and the object assumes that the same interaction action has similar spatial characteristics across different objects. In natural scenes, however, a shared spatial interaction detection module ignores the semantic differences between object instances of different classes that correspond to the same interaction behavior. To address this problem, this thesis improves the semantic richness of spatial features by enhancing visual-spatial features with object semantic information. The first step proposes an external-attention spatial interaction detection module based on object semantic information. By learning the latent relationships across samples, the same objects in different samples can obtain similar representations. The second step uses object semantic information to combine samples, expanding the training set to alleviate spatial relationship ambiguity. To further pull together the spatial feature encodings of semantically similar objects and push apart those of semantically different objects, the thesis adopts spatial feature alignment to guide learning.
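The abstract does not spell out how the external attention module is built. As a rough illustration only, the sketch below shows a generic external-attention layer in its commonly described form: input features are compared against a small key memory shared by all samples, the scores are normalized twice (softmax over the inputs, then L1 over the memory slots), and the output is read from a shared value memory. The function name `external_attention`, the memories `mem_k` and `mem_v`, and all sizes are assumptions made for this example, not details taken from the thesis. In a trained model the memories would be learned parameters; here they are random.

```python
import math
import random


def external_attention(features, mem_k, mem_v):
    """Generic external attention (illustrative, not the thesis module).

    features: n input vectors of length d
    mem_k:    s key-memory vectors of length d, shared across samples
    mem_v:    s value-memory vectors, shared across samples
    """
    n, s = len(features), len(mem_k)
    # Similarity of every input feature to every shared memory slot: (n, s).
    scores = [[sum(f * k for f, k in zip(feat, key)) for key in mem_k]
              for feat in features]
    # Step 1 of double normalization: softmax over the n inputs, per slot.
    attn = [[0.0] * s for _ in range(n)]
    for j in range(s):
        col_max = max(scores[i][j] for i in range(n))
        exps = [math.exp(scores[i][j] - col_max) for i in range(n)]
        total = sum(exps)
        for i in range(n):
            attn[i][j] = exps[i] / total
    # Step 2: L1-normalize over the s memory slots, per input.
    for i in range(n):
        row_sum = sum(attn[i]) + 1e-9
        attn[i] = [a / row_sum for a in attn[i]]
    # Read-out: each output is a weighted mix of the value-memory rows.
    d_out = len(mem_v[0])
    return [[sum(attn[i][j] * mem_v[j][c] for j in range(s))
             for c in range(d_out)]
            for i in range(n)]


# Hypothetical sizes: 5 human-object spatial features of length 8, 4 slots.
random.seed(0)
n, d, s = 5, 8, 4
features = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
mem_k = [[random.gauss(0, 1) for _ in range(d)] for _ in range(s)]
mem_v = [[random.gauss(0, 1) for _ in range(d)] for _ in range(s)]
out = external_attention(features, mem_k, mem_v)
print(len(out), len(out[0]))
```

Because the two memories are shared by every sample rather than computed from a single input, features of the same object class in different samples are projected onto the same set of slots, which is one plausible way to obtain the similar cross-sample representations the abstract describes.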
Keywords/Search Tags: Human-Object Interaction, Multimodal Fusion, 3D Mesh Topology, Geometric Deep Learning, Semantic Information