
Research On First-view Video Action Recognition Technology Based On Multi-feature Fusion

Posted on: 2021-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Jiang
Full Text: PDF
GTID: 2518306512479054
Subject: Software engineering
Abstract/Summary:
With the emergence of cost-effective intelligent cameras and the rapid development of video social platforms, videos recorded from the first-person perspective have been flooding into people's lives in recent years. Research in egocentric vision has enormous potential applications, and first-person action recognition, as the cornerstone of video analysis, has received increasing attention from academia and industry. However, the exploration of action recognition in egocentric videos is still at a preliminary stage, and only a few theoretical studies currently focus on it. Egocentric videos differ significantly from third-person videos in visual content and are heterogeneous in nature.

The main work of this thesis is summarized as follows:

Firstly, a cross-feature fusion architecture is designed for the egocentric interaction scenario. In this architecture, global and local branches are used to model the motion of the different participants, and each branch deploys multimodal, multi-stream C3D networks to extract complementary spatiotemporal representations. Cross fusion is leveraged to eliminate redundancy and establish effective links between the two branches, which leads to a significant improvement in the accuracy of first-person interaction recognition.

Secondly, a two-stream attention 3D feature fusion network is proposed for the egocentric daily-activity scenario. In this network, a 3D attention module is applied to feature maps to suppress noise in spatiotemporal cues, while a modal attention module is applied to feature vectors to explore the importance of each modality. Ablation experiments confirm the effectiveness of the designed modules and show that the proposed algorithm acquires more discriminative feature representations.

Finally, a first-person action recognition system is designed and implemented. The system encapsulates the multi-feature fusion algorithms, enabling users to configure data and models and perform feature fusion interactively. In addition, intermediate results are fed back to the user interface to show the efficiency and recognition performance of the algorithms intuitively.
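The following is a minimal PyTorch-style sketch of the cross-feature fusion idea from the first contribution: a global branch and a local branch, each with multimodal streams, fused by a learned cross-branch gate. The backbone, feature dimensions, modality choices (RGB and optical flow), and the gating form of the cross fusion are illustrative assumptions, not the exact design of the thesis.

```python
import torch
import torch.nn as nn

class Stream3D(nn.Module):
    """Stand-in for a C3D-style backbone: a shallow 3D-conv stack that maps
    a clip (B, C, T, H, W) to a pooled feature vector (B, feat_dim)."""
    def __init__(self, in_channels: int, feat_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(64, feat_dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )

    def forward(self, x):
        return self.conv(x).flatten(1)  # (B, feat_dim)

class CrossFusionNet(nn.Module):
    """Global branch (whole frame) and local branch (e.g. a crop around the
    interacting person), each with an RGB stream and a flow stream; a learned
    gate performs the cross fusion between the two branches."""
    def __init__(self, num_classes: int, feat_dim: int = 256):
        super().__init__()
        self.global_rgb, self.global_flow = Stream3D(3, feat_dim), Stream3D(2, feat_dim)
        self.local_rgb, self.local_flow = Stream3D(3, feat_dim), Stream3D(2, feat_dim)
        # Cross fusion (illustrative): a softmax gate weights the two branches.
        self.gate = nn.Sequential(nn.Linear(4 * feat_dim, 2), nn.Softmax(dim=1))
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, g_rgb, g_flow, l_rgb, l_flow):
        g = torch.cat([self.global_rgb(g_rgb), self.global_flow(g_flow)], dim=1)
        l = torch.cat([self.local_rgb(l_rgb), self.local_flow(l_flow)], dim=1)
        w = self.gate(torch.cat([g, l], dim=1))  # (B, 2) branch weights
        fused = w[:, 0:1] * g + w[:, 1:2] * l    # weighted cross fusion
        return self.classifier(fused)
```

Likewise, a hedged sketch of the two attention modules from the second contribution: a 3D attention that re-weights positions of a spatiotemporal feature map, and a modal attention that weights per-modality feature vectors before fusion. The module names and the specific attention forms (sigmoid gating over positions, softmax over modalities) are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class Attention3D(nn.Module):
    """3D attention over a spatiotemporal feature map (B, C, T, H, W): a 1x1x1
    conv produces a saliency map that re-weights every position, suppressing
    noisy spatiotemporal locations."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv3d(channels, 1, kernel_size=1)

    def forward(self, x):
        attn = torch.sigmoid(self.score(x))  # (B, 1, T, H, W)
        return x * attn                      # re-weighted feature map

class ModalAttention(nn.Module):
    """Modal attention over per-modality feature vectors: learns a softmax
    weight for each modality (e.g. RGB, optical flow) and fuses them."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, feats):                # list of (B, feat_dim) tensors
        stacked = torch.stack(feats, dim=1)  # (B, M, feat_dim)
        weights = torch.softmax(self.score(stacked), dim=1)  # (B, M, 1)
        return (weights * stacked).sum(dim=1)                # (B, feat_dim)

# Example usage with random clips/features (shapes are illustrative):
if __name__ == "__main__":
    fmap = torch.randn(2, 256, 8, 14, 14)
    fmap = Attention3D(256)(fmap)                     # same shape, re-weighted
    rgb_vec, flow_vec = torch.randn(2, 256), torch.randn(2, 256)
    fused = ModalAttention(256)([rgb_vec, flow_vec])  # (2, 256)
```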
Keywords/Search Tags: egocentric videos, action recognition, multi-modality, multi-feature fusion