Generative Adversarial Network Based Human Target Understanding And Analysis

Posted on:2021-01-02

Degree:Master

Type:Thesis

Country:China

Candidate:J L Tang

Full Text:PDF

GTID:2428330614968289

Subject:Information and Communication Engineering

Abstract/Summary:

Understanding and analyzing human targets is a core function of intelligent surveillance video (image) processing systems, with urgent practical demand and broad application prospects in security and other fields. It is also one of the most active research directions in computer vision. Focusing on human targets in videos or images, this thesis studies two computer vision tasks, crowd counting and human action prediction, from the perspectives of the entire crowd and the individual person, respectively. The main work and contributions of this thesis are as follows:

1. This thesis applies the generative adversarial network as a unified framework for the high-quality image generation problem that arises in both crowd counting and human action prediction. Specifically, based on the generative adversarial network, it designs model structures tailored to each task, generating density maps with sharp details and predicted frames with realistic appearance, respectively.

2. For crowd counting, this thesis proposes a generative adversarial network based method for generating high-quality crowd density maps with precise spatial location of the crowd. Specifically, building on a feature pyramid network backbone, it designs a generator that fuses low-level features, rich in spatial location information about the crowd, with high-level features, rich in semantic information about the crowd, via lateral connections between the bottom-up and top-down pathways of the feature pyramid network, equipping the model with both semantic and spatial perception of the crowd. In addition, spatial and channel attention mechanisms are introduced to achieve selective feature extraction and feature fusion.

3. For human action prediction, this thesis proposes a generative adversarial network based method for appearance-preserving human action prediction under the guidance of human pose. First, an LSTM-based pose prediction network predicts the coordinates of human landmarks in future frames. Then, using the human pose information represented by these coordinates, this thesis designs a single-frame generator that combines a global generative adversarial network and a local generative adversarial network to capture appearance features at different levels and reconstruct human appearance in a coarse-to-fine, global-to-local way, generating high-quality single frames. Finally, this thesis designs a video refinement network based on a 3D encoder-decoder architecture to further improve the overall quality of the generated videos.
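
The fusion of low-level and high-level features through lateral connections, described above for the crowd counting generator, can be illustrated with a minimal sketch. This is not the thesis's model. It assumes single-channel feature maps, a scalar weight standing in for each 1x1 lateral convolution, and nearest-neighbour upsampling. The names `upsample2x`, `lateral`, `top_down_fuse`, `c2`, and `c3` are hypothetical, and the attention mechanisms are omitted.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (rows of floats)


def upsample2x(fmap: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling: each cell becomes a 2x2 block."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def lateral(fmap: Grid, weight: float) -> Grid:
    """Assumed stand-in for a 1x1 lateral convolution on one channel."""
    return [[weight * v for v in row] for row in fmap]


def top_down_fuse(bottom_up: List[Grid], weights: List[float]) -> List[Grid]:
    """Fuse bottom-up maps (finest first) into pyramid maps (finest first)."""
    # Start the top-down pathway at the coarsest, most semantic level.
    fused = [lateral(bottom_up[-1], weights[-1])]
    for fmap, w in zip(reversed(bottom_up[:-1]), reversed(weights[:-1])):
        up = upsample2x(fused[0])   # bring coarse semantics to finer resolution
        lat = lateral(fmap, w)      # lateral connection from the bottom-up path
        merged = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lat, up)]
        fused.insert(0, merged)
    return fused


# Toy inputs: a fine 4x4 map (spatial detail) and a coarse 2x2 map (semantics).
c2 = [[1.0, 2.0, 3.0, 4.0],
      [5.0, 6.0, 7.0, 8.0],
      [9.0, 10.0, 11.0, 12.0],
      [13.0, 14.0, 15.0, 16.0]]
c3 = [[10.0, 20.0],
      [30.0, 40.0]]

p2, p3 = top_down_fuse([c2, c3], weights=[1.0, 1.0])
print(p2[0])
```

Each fused map keeps the resolution of its bottom-up counterpart while adding the upsampled coarser map, which is how the finest output carries both spatial and semantic information. A real generator would use multi-channel tensors and learned convolutions in place of the scalar weights.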

Keywords/Search Tags:

Human Target Understanding and Analysis, Crowd Counting, Human Action Prediction, Generative Adversarial Network

Related items

1	Video-based Human Action Recognition And Prediction
2	Crowd Counting Analysis Under Complicated Scenes
3	Research And Application Of Crowd Counting Methods Based On Two Models
4	Crowd Counting Based On Adaptive Map Refinement
5	Online Human Action Analysis Based On Deep Learning
6	Research Of Crowd Density Estimation Based On Generative Adversarial Network
7	Research On Video-based Human Action Recognition And Prediction
8	Multi-Scale Crowd Counting Under Complex Scenes Based On Generative Adversarial Networks
9	Target Trajectory Prediction And Intention Understanding In Adversarial Environment
10	Research On High-Density Crowd Counting Algorithm Based On Deep Learning