Double Interactive Behavior Recognition Based On RGB And Depth Information Fusion

Posted on:2020-12-26

Degree:Master

Type:Thesis

Country:China

Candidate:P Wei

Full Text:PDF

GTID:2428330590466504

Subject:Control theory and control engineering

Abstract/Summary:

Video-based interactive behavior recognition is an important research direction in machine vision, with broad application prospects in intelligent security and video content retrieval. Human interaction recognition based on RGB video alone lacks depth information, so it adapts poorly to interference such as illumination or background in complex environments, which results in low accuracy on complex interactive behaviors. To make up for these shortcomings of RGB video, this thesis studies two-person interaction recognition by fusing RGB and depth information.

First, two-person interactive behavior recognition based on RGB video sequences lacks depth information and identifies complex, variable interactions inaccurately. To address this, the thesis proposes a two-person interactive behavior recognition algorithm that combines individual information from the depth video with overall information from the RGB video. On the RGB video source, a holistic method is used to represent the action video. On the depth video source, the two interacting people are segmented individually with the YOLO network structure; a visual co-occurrence matrix is then applied to the segmented individuals in the video, and the features of the interest points associated with each person are sent to a classifier for classification. Finally, the two information sources are fused. The algorithm is easy to implement and highly operable, and it greatly improves the recognition rate.

Second, two-person interactive behavior recognition algorithms are generally based on traditional feature descriptions of video, which have high computational complexity and relatively low recognition accuracy. The thesis therefore proposes a deep learning network based on dual-stream fusion of RGB and depth video. The model uses a convolutional neural network (CNN) to extract and vectorize the spatial features of the image sequence, and feeds the resulting vectors into long short-term memory (LSTM) network units for time-series modeling. During training, the RGB video and depth video data streams are sent to separate networks, each of which trains its own interaction behavior model; the class probability matrices obtained from the two networks are then sent to softmax for fusion to obtain the final recognition result. Compared with the traditional algorithm, the recognition rate of this algorithm is greatly improved.

Finally, building on the above research, the thesis proposes an RGB-D two-person interactive behavior recognition algorithm based on attention mechanism convolution. The algorithm uses attention mechanism convolution to automatically extract the salient local joint features of the action subclass, and combines these features with an LSTM network to complete the feature representation and time-series modeling of the video actions, obtaining a better recognition effect. Its accuracy is not greatly improved over the recognition rate of the convolutional algorithm, but its training converges quickly, oscillates with small amplitude, and tends to be stable, which is of practical significance.
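
The fusion step of the dual-stream model can be illustrated with a minimal sketch. The abstract does not give the exact fusion rule, so the code below assumes a weighted average of the softmax class probabilities from the RGB and depth streams. The function names, the `rgb_weight` parameter, and the example scores are all hypothetical and are not taken from the thesis.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(rgb_scores, depth_scores, rgb_weight=0.5):
    """Late fusion of two streams by weighted averaging of class probabilities.

    Assumption: the thesis only says the class probability matrices are sent
    to softmax for fusion; the weighted average here is one plausible reading.
    rgb_weight is a hypothetical parameter in the range [0, 1].
    """
    if len(rgb_scores) != len(depth_scores):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    p_rgb = softmax(rgb_scores)
    p_depth = softmax(depth_scores)
    fused = [
        rgb_weight * a + (1.0 - rgb_weight) * b
        for a, b in zip(p_rgb, p_depth)
    ]
    # The predicted class is the index with the highest fused probability.
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# Made-up per-class scores for three interaction classes.
rgb_scores = [2.0, 1.0, 0.1]    # RGB stream favors class 0
depth_scores = [0.5, 2.5, 0.3]  # depth stream favors class 1
fused, predicted = fuse_streams(rgb_scores, depth_scores)
print(predicted, [round(p, 3) for p in fused])
```

In this example the two streams disagree, and with equal weights the more confident depth stream decides the result. In a full system the scores would come from the trained CNN and LSTM networks of each stream, not from hand-written lists.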

Keywords/Search Tags:

Behavior recognition, Overall information, Individual segmentation, Long short-term memory network, Probability matrix, Feature-level fusion, Attention mechanism convolution

Related items

1	Chinese Sign Language Recognition Based On Convolutional Network And Long Short Term Memory Network
2	Research On Group Behavior Recognition Based On Multi-stream Architecture And Long Short-term Memory Network
3	Research On Audio-Video Information Processing Based On Lip-Changing
4	Research On Pedestrian Trajectory Prediction Algorithm Under Attention Mechanism
5	Research On Children’s Emotion Recognition Based On The Fusion Of Speech And Text Bimodality
6	Research On Text Classification Method Combining Attention Mechanism And Bi-GRU
7	Research On Behavior Recognition Based On Network Of CNN And LSTM
8	Research On Speech Emotion Recognition Technology Based On Context Feature Fusion
9	Application Of Short Term And Long Term Memory Neural Network In Stock Trend Prediction
10	Speaker Emotional State Recognition Based On Speech And Text Fusion