
Research And Application Of Human Motion Prediction Based On Multi-scale Attention Mechanism

Posted on: 2024-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Li
GTID: 2568307130953049
Subject: Software engineering
Abstract/Summary:
Human action prediction is an important problem that spans multiple fields, such as human-computer interaction, game development, and sports medicine. Deep learning-based methods have been widely applied in this field, but they face challenges such as data quality and model interpretability. Solving these problems can improve the accuracy and usability of predictions and thereby promote the development of the field. This thesis proposes methods to address these problems, with the following specific work:

(1) To address the insufficient processing of the multidimensional structural information of human actions in previous studies, this thesis proposes a feature extraction method based on a multi-scale attention mechanism (MSAM). The method uses multidimensional positional encoding to describe the pose information of the human body more accurately and to improve the representational ability of the data. Lie algebra and quaternion representations are used to represent joints, describing human motion states at both the spatial and temporal scales. Multi-head attention is used to capture multi-scale information in the data and to extract more comprehensive and accurate features. In experiments on public datasets, the proposed method is compared with existing methods; the results show that it has significant advantages in human action representation and feature extraction, and that it better captures the multi-scale information, fine details, and rapid changes of human actions.

(2) To address the problem of long-term dependency in human action prediction and to improve the confidence of the prediction results, this thesis proposes the MS-TCN Transformer human action prediction model, building on the work in (1). The action generation network of this framework consists of multiple temporal convolutional network modules and residual structures, which extract features across long time spans and capture the dependency relationships between actions. The global average pooling layer in MS-TCN Transformer aggregates and compresses features, improving the model's robustness and generalization performance. Experiments on several public datasets show that, compared with recent prediction models, MS-TCN Transformer achieves significant improvements in both short-term and long-term prediction, with more accurate prediction results and smaller average joint offset angles.

(3) This thesis implements a prototype human motion prediction system based on the MS-TCN Transformer model. The system provides both prediction of human body movements and visualization of the predicted data, which verifies the feasibility and practicality of the method proposed in the thesis.
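As a rough illustration of the building blocks named in (2), the following NumPy sketch stacks temporal convolution modules with residual connections and then applies global average pooling over time. It is not the thesis's MS-TCN Transformer: the causal dilated convolution, the ReLU activation, the kernel size, the dilation schedule, the random weights, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """Causal 1-D convolution over time (illustrative, not the thesis's layer).

    x: (T, C_in) sequence of per-frame features.
    w: (K, C_in, C_out) kernel with K taps.
    Returns (T, C_out); output frame t depends only on frames <= t.
    """
    T, c_in = x.shape
    K = w.shape[0]
    pad = (K - 1) * dilation
    # Left-pad with zeros so the output keeps length T and never sees the future.
    xp = np.concatenate([np.zeros((pad, c_in)), x], axis=0)
    out = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Tap k reads the frame (K - 1 - k) * dilation steps in the past.
        out += xp[k * dilation : k * dilation + T] @ w[k]
    return out

def residual_tcn_block(x, w, dilation):
    """Residual structure: input plus ReLU(convolution). Assumes C_in == C_out."""
    return x + np.maximum(causal_dilated_conv1d(x, w, dilation), 0.0)

def global_average_pool(x):
    """Average over the time axis, compressing (T, C) to a (C,) summary."""
    return x.mean(axis=0)

rng = np.random.default_rng(0)
T, C = 50, 8                      # hypothetical: 50 frames, 8 feature channels
x = rng.standard_normal((T, C))   # stand-in for encoded pose features

h = x
for d in (1, 2, 4):               # assumed dilation schedule
    w = rng.standard_normal((3, C, C)) * 0.1
    h = residual_tcn_block(h, w, d)

summary = global_average_pool(h)
print(h.shape, summary.shape)
```

With kernel size 3 and dilations 1, 2, and 4, each output frame of the stack sees 1 + 2 * (1 + 2 + 4) = 15 input frames, which is how stacked modules can cover long time spans with few layers.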
Keywords/Search Tags:Position encoding, Multi-scale, Lie algebra, Attention mechanism, Action prediction