
Continuous Sign Language Recognition Based On Keypoints Of Human Skeleton

Posted on: 2024-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: X L Xie
Full Text: PDF
GTID: 2568307151964599
Subject: Mechanical engineering
Abstract/Summary:
According to the seventh national population census, the number of deaf and mute individuals in China exceeds 20 million. However, for lack of effective communication tools, they face barriers when communicating with hearing individuals, which makes it difficult for them to participate in a wider range of social activities. To enhance their capacity for self-development and improve their quality of life, the Chinese government's "14th Five-Year Plan" explicitly calls for supporting the development of the deaf and mute population. With the rapid advance of deep learning, a variety of sign language recognition techniques have emerged, offering deaf and mute individuals the possibility of barrier-free participation in social activities. This paper proposes a continuous sign language recognition method aimed at reducing the communication barriers between deaf and mute individuals and hearing individuals, enabling the former to integrate into society more easily. The main contents are as follows:

Firstly, a method based on the MediaPipe framework is proposed to address the data loss that occurs during keypoint extraction. It combines image enhancement with linear interpolation to improve data quality. With this approach, a dataset of 9,600 Chinese sign language skeletal keypoint coordinate sequences is created. The collected data are additionally processed frame by frame with min–max normalization to improve the generality of the dataset.

Secondly, on the skeleton topology of sign language word actions constructed from the skeletal keypoint coordinates, a spatio-temporal graph neural network is applied to extract features in both the spatial and temporal dimensions. A spatial attention mechanism is integrated into the network to assign weights to the skeletal points, highlighting the effective spatial features and achieving sign language word recognition.

Thirdly, to address semantic alignment in continuous sign language sentence recognition, a model combining graph convolutional networks with bidirectional long short-term memory (BiLSTM) networks is proposed. The model extracts the spatio-temporal features of sign language sentences and encodes them as semantic information; it is trained by jointly optimizing an attention mechanism and connectionist temporal classification (CTC) to align the semantic sequence and decode the semantic information into Chinese.

Finally, a sign language recognition system is built with PyTorch, OpenCV, PyQt, MediaPipe, and other frameworks. The system supports both real-time video recognition and local video recognition, and the effectiveness of both sign language word recognition and sign language sentence recognition is validated through it.
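The data-repair pipeline described in the abstract (linear interpolation for frames where keypoint detection fails, followed by per-frame min–max normalization) might be sketched as follows. This is a minimal illustration, not the thesis's implementation: the array layout (frames × joints × coordinates) and the NaN convention for lost detections are assumptions.

```python
import numpy as np

def interpolate_missing(seq):
    """Fill NaN keypoint coordinates (detection failures) by linear
    interpolation along the time axis. seq has shape (T, V, C):
    T frames, V joints, C coordinates per joint."""
    seq = seq.copy()
    t = np.arange(seq.shape[0])
    for j in range(seq.shape[1]):       # each joint
        for c in range(seq.shape[2]):   # each coordinate (x, y, ...)
            col = seq[:, j, c]
            nan = np.isnan(col)
            if nan.any() and not nan.all():
                col[nan] = np.interp(t[nan], t[~nan], col[~nan])
    return seq

def minmax_normalize(frame):
    """Per-frame min-max normalization of joint coordinates to [0, 1],
    applied frame by frame as in the dataset construction step."""
    lo, hi = frame.min(axis=0), frame.max(axis=0)
    return (frame - lo) / np.maximum(hi - lo, 1e-8)
```

Normalizing per frame makes the coordinates invariant to where the signer stands in the image, which is one plausible reason the abstract ties this step to the "generality" of the dataset.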
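The spatial half of the graph network, with attention weights assigned to skeletal points, could look roughly like this numpy sketch. The symmetric adjacency normalization, the per-joint attention vector, and the ReLU are assumptions; the thesis's actual spatio-temporal network is certainly more elaborate and implemented in PyTorch.

```python
import numpy as np

def spatial_gcn_layer(X, A, W, attn):
    """One attention-weighted spatial graph convolution.
    X: (T, V, C) keypoint features over T frames and V joints;
    A: (V, V) skeleton adjacency with self-loops;
    W: (C, C_out) learned feature transform;
    attn: (V,) learned spatial attention over joints."""
    D = A.sum(axis=1)
    A_hat = A / np.sqrt(np.outer(D, D))      # symmetric normalization
    A_w = A_hat * attn[None, :]              # re-weight joints by attention
    # Aggregate over neighbor joints, transform features, ReLU.
    return np.maximum(np.einsum('uv,tvc,co->tuo', A_w, X, W), 0.0)
```

The attention vector lets the network emphasize the joints that carry the sign (typically the hands) over those that do not, which matches the abstract's "highlighting the effective spatial features".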
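Connectionist temporal classification aligns per-frame predictions with the target gloss sequence without frame-level labels; at inference time the standard greedy decoding collapses repeated labels and removes blanks. A minimal version of that decoding step (the blank index and the score layout are assumptions, and the thesis additionally uses attention jointly with CTC):

```python
import numpy as np

def ctc_greedy_decode(scores, blank=0):
    """Greedy CTC decoding: take the best class per frame, collapse
    consecutive repeats, drop the blank symbol.
    scores: (T, num_classes) per-frame class scores."""
    path = np.argmax(scores, axis=1)
    out, prev = [], blank
    for p in path:
        if p != blank and p != prev:
            out.append(int(p))
        prev = int(p)
    return out
```

For example, the frame-level path `1 1 blank 1 2 2 blank` decodes to the label sequence `1 1 2`: the blank separates the two genuine occurrences of label 1, while the repeated 2s collapse to one.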
Keywords/Search Tags:sign language recognition, MediaPipe, spatio-temporal graph convolution networks, attention mechanism, connectionist temporal classification