
A New Hybrid Deep Learning Model To Improve Human-object Interaction Detection In Images

Posted on: 2022-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Sun
Full Text: PDF
GTID: 2518306536454764
Subject: Software engineering
Abstract/Summary:
Human-object interaction detection is one of the most popular research directions in the field of artificial intelligence. It is also the basis of research on image understanding and the automatic description of image content. In practical applications, it can be widely used for website image search, security system detection, and similar tasks. This paper proposes a new hybrid deep learning model that aims to improve the efficiency of human-object interaction detection in images. First, we design a deep learning method that models the three-dimensional spatial relationship information of the image and explores the composition patterns of interactions between humans and objects. Second, we design a hybrid deep learning model that integrates the features of the objects in the image and the three-dimensional spatial relationship information between objects into the deep network architecture. This establishes an integrated inference and learning process from feature learning to interaction detection, making the most of the spatial composition information in the image to improve the efficiency of human-object interaction detection.

The contributions of this paper are threefold:

(1) This paper proposes a graph-based, cube-structured model for multi-granularity spatial relationship analysis. The model discretely abstracts the three-dimensional spatial relationship between people and objects into a cube structure and analyzes that relationship at multiple granularities. It provides a detailed analysis of the spatial relationships in images and summarizes the rules of spatial composition, which assists both the detection of the main person and object in the image and the detection of human-object interaction.

(2) To intelligently detect the main interacting subjects in the image, that is, the main people and objects, and to provide reliable targets for the subsequent detection of human-object interactions, this paper integrates a variety of machine learning algorithms and analyzes the three-dimensional spatial relationship information between people and objects in depth. We derive the spatial patterns in the composition of people and objects, establish a prediction model from spatial relationships to combinations of people and objects, and infer the main people and objects in the image.

(3) This paper proposes a factor-based hybrid deep learning model. The model integrates various image features and three-dimensional spatial relationship information into interaction prediction through multiple multiplicative connection mechanisms, and realizes a high-level non-linear mapping from the spatial structure of an image to an understanding of its content. The model first uses a multi-layer restricted Boltzmann machine to extract the features of people and objects separately. It then uses a factor-based multiplicative connection mechanism to integrate image features, three-dimensional spatial relationships, and spatial composition rules into the deep network architecture. The model can therefore make maximal use of the spatial composition information in the image to guide the deep learning framework toward the high-level invariant features that are most useful for interaction detection, which fundamentally improves the efficiency of human-object interaction detection.

To verify the performance of the hybrid deep learning model in recognizing human-object interactions, this paper compares the proposed model against different methods on two datasets.
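
The abstract does not give the form of the factor-based multiplicative connection, so the following is only a minimal sketch of one common way such a connection can be written: appearance features and a spatial-relationship code are each projected onto a shared set of factors, multiplied element-wise, and then mapped to hidden units. All names, dimensions, and the random weights here (`factored_hidden`, `W_a`, `W_s`, `W_h`, the one-hot spatial code) are illustrative assumptions, not the thesis's actual architecture or parameters.

```python
import math
import random


def project_to_factors(W, v):
    """Return W^T v, where W has len(v) rows and one column per factor."""
    n_factors = len(W[0])
    return [sum(W[i][k] * v[i] for i in range(len(v))) for k in range(n_factors)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def factored_hidden(appearance, spatial, W_a, W_s, W_h, bias):
    """Hypothetical factored multiplicative layer (an assumption, not the thesis model).

    Each input is projected onto the same factors; the projections are
    multiplied element-wise, so the spatial code gates the appearance features.
    """
    f_a = project_to_factors(W_a, appearance)
    f_s = project_to_factors(W_s, spatial)
    gated = [a * s for a, s in zip(f_a, f_s)]
    # W_h has one row per hidden unit and one column per factor.
    pre = [sum(w * g for w, g in zip(row, gated)) for row in W_h]
    return [sigmoid(p + b) for p, b in zip(pre, bias)]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Illustrative sizes only: 6 appearance features, 4 spatial cells, 5 factors, 3 hidden units.
rng = random.Random(0)
n_app, n_spa, n_fac, n_hid = 6, 4, 5, 3
W_a = random_matrix(n_app, n_fac, rng)
W_s = random_matrix(n_spa, n_fac, rng)
W_h = random_matrix(n_hid, n_fac, rng)
bias = [0.0] * n_hid

appearance = [rng.random() for _ in range(n_app)]
spatial = [1.0, 0.0, 0.0, 0.0]  # assumed one-hot code for a discretized spatial cell

hidden = factored_hidden(appearance, spatial, W_a, W_s, W_h, bias)
no_spatial = factored_hidden(appearance, [0.0] * n_spa, W_a, W_s, W_h, bias)
```

In this sketch, an all-zero spatial code switches the gate off, so every hidden unit falls back to the sigmoid of its bias. That illustrates why a multiplicative connection lets spatial information modulate the appearance features rather than simply being added to them.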
Keywords/Search Tags: human-object interaction recognition, three-dimensional spatial relationship information, hybrid deep learning, major human-object group