With the development of home intelligence, a variety of smart devices have emerged. Traditional contact-based interaction struggles to meet users' needs for convenience, and non-contact forms of control such as voice have arisen in response, shifting device input from physical contact to spoken language and providing a relatively natural mode of control. However, whether contact or non-contact, most interaction methods rely on a single input modality; when that modality is degraded, the device's input becomes ambiguous and the device cannot respond correctly, harming the interactive experience. To address this problem, this paper applies multi-modal fusion to the human-computer interaction control of home devices, integrating speech and gesture recognition to design a multi-modal, non-contact human-computer interaction method. The natural human-computer interaction of a multi-modal smart home is studied from the following three aspects:

(1) To address the strong interference that complex backgrounds in home environments cause for gesture recognition, a dual-stream fusion framework combining skeleton and depth-map data is proposed for dynamic gesture recognition. First, a BiLSTM network extracts features from 2D skeleton information, while a CNN combined with a BiLSTM extracts features from 2D depth maps. Several fusion strategies are then studied: feature-level concatenation, low-rank multimodal fusion (LMF) via low-rank weight decomposition, and score-level maximum and mean fusion. The fused features carry richer gesture information, and the recognition result is obtained through a classification layer. Experiments show that dual-stream fusion improves recognition accuracy over either single modality.

(2) To address the low speech recognition accuracy caused by strong noise in home environments, an end-to-end speech recognition method based on fine-tuning the Deep Speech 2 model is proposed to recognize streaming speech. First, speech features are extracted with a linear-spectrogram preprocessing method; then a CNN and GRUs serve as the acoustic model, mapping speech features to phonemes. Finally, a pre-trained language model is invoked and the decoding results are optimized with a beam search algorithm. Results show improved recognition accuracy over traditional methods.

(3) To address the inflexibility of single-mode command sensing for household devices, a multi-modal human-computer interaction method is proposed. Speech signals and gesture motion sequences are first collected, and the corresponding trained model recognizes each modality. The gesture and speech recognition results are then combined by late fusion, with the two modalities' recognition scores weighted and fused. This gives the device multiple sensing channels and improves command recognition accuracy, so that it can respond correctly.

Building on the above work, a multi-modal-fusion human-computer interaction system is implemented with the Python web framework Flask; a voice acquisition device captures speech signals, and a depth camera collects skeleton and depth information as input to the models above. The interaction method studied in this paper targets smart-home scenarios, integrating the voice and gesture modes of non-contact interaction and using multiple modalities to compensate for the weaknesses of any single one, so that the device accepts multiple input modes, responds more accurately, and improves human-computer interaction for the smart home.
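The four fusion strategies studied in (1) can be sketched with toy NumPy shapes. All dimensions, random weights, and the per-stream classifier heads below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Assumed toy dimensions: 128-d features per stream, 10 gesture classes.
rng = np.random.default_rng(0)
f_skel = rng.standard_normal(128)   # BiLSTM features from the skeleton stream
f_depth = rng.standard_normal(128)  # CNN+BiLSTM features from the depth-map stream

# 1) Feature-level fusion: concatenate the streams, then classify.
W_cat = rng.standard_normal((10, 256))          # hypothetical classification layer
p_concat = softmax(W_cat @ np.concatenate([f_skel, f_depth]))

# 2) LMF: factorize the fusion weight tensor into r rank-1 factors and combine
#    modality projections by element-wise product (each feature padded with 1).
r = 4
W_s_lmf = rng.standard_normal((r, 10, 129))
W_d_lmf = rng.standard_normal((r, 10, 129))
z_s, z_d = np.append(f_skel, 1.0), np.append(f_depth, 1.0)
p_lmf = softmax(((W_s_lmf @ z_s) * (W_d_lmf @ z_d)).sum(axis=0))

# 3)-4) Score-level fusion: classify each stream with its own (hypothetical)
#    head, then take the element-wise maximum or the mean of the probabilities.
p_skel = softmax(rng.standard_normal((10, 128)) @ f_skel)
p_depth = softmax(rng.standard_normal((10, 128)) @ f_depth)
p_max = np.maximum(p_skel, p_depth)
p_mean = (p_skel + p_depth) / 2
pred = int(np.argmax(p_mean))   # fused class prediction
```

Note that the mean of two probability vectors is itself a valid distribution, whereas the element-wise maximum is not normalized and is typically used only for ranking classes.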
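The language-model rescoring step in (2) can be illustrated in simplified form: candidate transcripts from the acoustic model are rescored by adding a weighted language-model log-probability (and a word-count bonus), and the best-scoring candidate is kept. The candidates, scores, and weights below are made-up toy values; a real decoder would search prefixes incrementally with beam search rather than rescore whole strings:

```python
# Hypothetical decoding candidates with acoustic log-probs from the CTC output.
candidates = {
    "turn on the light": -4.2,
    "turn on the lite":  -4.0,
}
# Toy language-model log-probs; a real system would query a pre-trained LM.
lm_logp = {"turn on the light": -6.0, "turn on the lite": -11.0}

alpha, beta = 0.5, 0.1   # assumed LM weight and word-count bonus

def rescore(text, ac_logp):
    return ac_logp + alpha * lm_logp[text] + beta * len(text.split())

best = max(candidates, key=lambda t: rescore(t, candidates[t]))
# best == "turn on the light": the LM overrides the acoustically favored misspelling
```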
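The weighted late fusion described in (3) can be sketched as follows. The command set and the weight values are illustrative assumptions, not values from the paper; in practice the weights could be tuned on validation data:

```python
import numpy as np

# Hypothetical command set shared by both recognizers.
COMMANDS = ["light_on", "light_off", "tv_on", "tv_off"]

def late_fuse(p_speech, p_gesture, w_speech=0.6, w_gesture=0.4):
    """Weighted late fusion of per-modality class probabilities.

    If one modality is unavailable, its weight can be set to zero so the
    other modality alone drives the decision.
    """
    p = w_speech * np.asarray(p_speech) + w_gesture * np.asarray(p_gesture)
    p /= p.sum()  # renormalize to a probability distribution
    return COMMANDS[int(np.argmax(p))], p

# Speech is fairly sure of "light_on"; gesture mildly agrees.
cmd, p = late_fuse([0.7, 0.1, 0.1, 0.1], [0.4, 0.3, 0.2, 0.1])
# cmd == "light_on"
```

Letting the fused distribution, rather than either modality alone, pick the command is what allows one degraded input channel to be compensated by the other.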