
Implementation Of Multimodal Active Perception Based On Humanoid Robotic Arm

Posted on: 2024-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: W H Wang
Full Text: PDF
GTID: 2568307055969789
Subject: Electronic information

Abstract/Summary:
When faced with complex scenes, humans call on multiple senses to perceive and search for target objects and then perform various operations. Similarly, when robots operate in unstructured real-world environments, it is difficult to perform complex perceptual operations with a single modality alone; for example, vision by itself cannot distinguish the softness and texture of objects. To improve their perception capabilities, intelligent robots need to build a rich understanding of the physical world: on the basis of understanding human intentions, they must capture heterogeneous modal information about objects through multi-level perception pathways constructed from multi-source sensors (depth cameras, sound sensors, tactile sensors, etc.), and combine high-performance information processing with highly flexible manipulation to carry out operations.

Supported by the "Human-Robot Intelligent Fusion Technology" project of the National Key Research and Development Program of China, we propose a multimodal active perception operation method for robots that combines speech, vision, hearing, and touch, solving the problem of insufficient single-modal perception capability and achieving complementary perception. The main research work of this thesis is summarized as follows:

Firstly, starting from the robot's perception operation task, we propose a multimodal active perception operation method based on a humanoid robotic arm. Unlike traditional robots, intelligent robots need to collect heterogeneous modal information with multi-source sensing systems and combine active perception techniques to complete exploration, thereby achieving incremental learning of the perception task. On this basis, three multimodal joint perception models are proposed to adapt to different perception task requirements. The algorithmic models of each perception pathway are then defined, and for the tactile pathway a dual-branch tactile image recognition network, DbsMnet, is proposed to improve the generalization ability of tactile perception.

Secondly, to improve communication and collaboration between humans and robots, we designed a collaborative human-robot interaction system based on natural-language instruction understanding and created a rich set of operational language instructions for human-robot interaction; the robot invokes different multimodal perception models to complete operational tasks according to its understanding of the instructions. In addition, a friendly Human-Robot Interface (HRI) is deployed as a mobile app using the Wechaty chatbot framework to achieve more convenient interaction with the robot.

Thirdly, we constructed a multimodal perception dataset containing speech, vision, hearing, and touch using the multi-source sensing system of a humanoid robotic arm, completed the analysis and processing of the dataset on the server side, and verified its validity through offline experiments, providing a basis for the implementation of multimodal active perception technology.

Finally, combining the perception task requirements with the multimodal perception dataset, we built a hardware and software system for multimodal active perception and completed online experimental validation of the perception tasks on this basis. The experimental results show that the proposed multimodal active perception operation method can adapt to real physical environments and ensure the stability and effectiveness of the perceptual grasping process, providing a basis for the development of bionic and medically assistive robots.
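The abstract does not specify the internal structure of the dual-branch DbsMnet, but the general dual-branch pattern it names can be sketched as follows: two branches each map their input into a shared feature space, the features are fused, and a joint head classifies the result. The layer sizes, the fusion-by-concatenation choice, and all names below are illustrative assumptions, not the thesis's actual architecture.

```python
# Minimal sketch of a dual-branch fusion network (forward pass only).
# Assumption: each branch is a single linear layer + ReLU; fusion is
# concatenation followed by a linear classification head.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class Branch:
    """One perception branch: a single linear layer with ReLU."""
    def __init__(self, in_dim, feat_dim):
        self.W = rng.standard_normal((in_dim, feat_dim)) * 0.01
        self.b = np.zeros(feat_dim)

    def forward(self, x):
        return relu(x @ self.W + self.b)

class DualBranchNet:
    """Two branches (e.g. a raw tactile image and a companion view of it),
    fused by concatenation and classified by a linear head."""
    def __init__(self, in_a, in_b, feat_dim, n_classes):
        self.branch_a = Branch(in_a, feat_dim)
        self.branch_b = Branch(in_b, feat_dim)
        self.head_W = rng.standard_normal((2 * feat_dim, n_classes)) * 0.01
        self.head_b = np.zeros(n_classes)

    def forward(self, xa, xb):
        fa = self.branch_a.forward(xa)
        fb = self.branch_b.forward(xb)
        fused = np.concatenate([fa, fb], axis=-1)  # shared feature space
        return fused @ self.head_W + self.head_b   # class logits

net = DualBranchNet(in_a=64, in_b=64, feat_dim=16, n_classes=5)
xa = rng.standard_normal((2, 64))  # batch of 2 flattened tactile images
xb = rng.standard_normal((2, 64))  # batch of 2 second-branch inputs
print(net.forward(xa, xb).shape)   # (2, 5): one logit vector per sample
```

A real implementation would use convolutional branches trained end to end; the sketch only shows how two modality-specific branches feed one joint decision.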
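The idea of invoking different multimodal perception models based on understood language instructions can be illustrated with a toy dispatcher. The keyword table, the three model names, and the vision-first default below are all invented for illustration; the thesis's actual instruction set and model selection logic are not given in the abstract.

```python
# Hypothetical keyword-based dispatcher: maps an instruction to the set of
# perception pathways it requires, and hence to one of three (assumed)
# joint perception models.
MODALITY_KEYWORDS = {
    "vision":  ["find", "look", "color", "shape"],
    "hearing": ["shake", "rattle", "sound"],
    "touch":   ["soft", "hard", "squeeze", "texture"],
}

# Assumed mapping from required modality sets to joint perception models.
MODEL_FOR = {
    frozenset({"vision"}): "vision-only",
    frozenset({"vision", "touch"}): "visuo-tactile",
    frozenset({"vision", "hearing", "touch"}): "full-multimodal",
}

def required_modalities(instruction):
    """Infer which pathways an instruction needs from keyword matches."""
    words = instruction.lower().split()
    needed = {"vision"}  # assumption: every grasping task starts with vision
    for modality, keys in MODALITY_KEYWORDS.items():
        if any(k in words for k in keys):
            needed.add(modality)
    return frozenset(needed)

def select_model(instruction):
    """Pick a joint perception model; fall back to the richest one."""
    return MODEL_FOR.get(required_modalities(instruction), "full-multimodal")

print(select_model("find the red cup"))       # vision-only
print(select_model("squeeze the soft ball"))  # visuo-tactile
```

A deployed system would replace the keyword table with a trained natural-language understanding model, but the control flow — instruction, required modalities, chosen perception model — stays the same.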
Keywords/Search Tags:Multimodal Perception, Active Perception, Multi-source Sensing, Command Understanding, Robotic Arm Operation