| Natural language instruction analysis is an important step toward freer and smoother interaction between humans and robots. In human-robot interaction, the basic task of natural language instruction parsing is to make the robot understand the intent of an instruction and the meaning of its keywords. However, natural language instruction data is often difficult to collect, which leads to poor model generalization. In addition, navigation from natural language instructions is a necessary ability for human-robot interaction and a prerequisite for object-fetching tasks. However, because natural language is arbitrary, different people describe the same task with different instructions, which can make the alignment between instructions and navigation actions inconsistent during mapping. For the problem of poor generalization, joint intent-slot modeling methods and data augmentation methods are commonly used, but these methods do not fully consider the shared and private knowledge that exists among instructions with different intents. For the problem of inconsistent mapping, a non-end-to-end statistical model is often used, but this approach does not account for error accumulation or for the impact of the robot’s environment on the mapping process. Therefore, improving the model's generalization performance and its alignment ability during instruction mapping is the key to improving the human-robot interaction experience. First, to address the poor generalization ability of robots, a semantic feature extraction and fusion algorithm is proposed. The algorithm first uses multiple encoders to extract the shared and private features of instructions with different intents and then uses a gate mechanism to fuse the extracted features dynamically. The whole process fully considers the correlation 
between instructions with different intents and models it explicitly, so that the semantic features in the instructions can be fully utilized. Second, a joint intent-slot optimization method based on a graph attention network is proposed to model the correlation between the two tasks. By modeling this relationship explicitly, the method quantifies it and improves the feature adequacy of the end-to-end model. Compared with existing end-to-end joint intent-slot models, this method considers not only the guiding role of intent information on slot information but also the constraints and mutual relations among the different slots in an instruction, which further improves model performance. The explicit modeling of intents and slots also improves the interpretability of the model. Finally, to address the inconsistency of natural language instruction mapping, an end-to-end map-instruction bidirectional attention interaction model based on neural networks is proposed. The model uses map information as a knowledge base to assist instruction mapping and combines instruction features with map features through bidirectional attention interaction, which reduces errors and alleviates inconsistent mapping when generating action sequences. |
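
The abstract describes fusing shared and private features through a gate mechanism but does not give the gate's exact form. The following is a minimal sketch of one common choice, an element-wise sigmoid gate over the concatenated features, and should be read as an illustration under that assumption rather than as the proposed algorithm. The names `gated_fusion`, `gate_weights`, and `gate_bias` are hypothetical, and the weights would normally be learned rather than set by hand.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    shared: List[float],
    private: List[float],
    gate_weights: List[List[float]],
    gate_bias: List[float],
) -> List[float]:
    """Fuse a shared and a private feature vector with an element-wise gate.

    Assumed form (not taken from the abstract):
        gate  = sigmoid(W @ [shared; private] + b)
        fused = gate * shared + (1 - gate) * private
    """
    if len(shared) != len(private):
        raise ValueError("shared and private features must have the same size")

    concat = shared + private  # list concatenation: [shared; private]
    fused = []
    for i, (s, p) in enumerate(zip(shared, private)):
        # One gate value per feature dimension, computed from both inputs.
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(z)
        fused.append(g * s + (1.0 - g) * p)
    return fused


if __name__ == "__main__":
    shared = [1.0, 2.0]
    private = [3.0, 4.0]
    zero_weights = [[0.0] * 4, [0.0] * 4]
    # With zero weights and bias the gate is 0.5, so the result is the average.
    print(gated_fusion(shared, private, zero_weights, [0.0, 0.0]))
```

Because the gate is computed from both inputs, the mix between shared and private features can change per instruction, which is one way to read the "dynamic" fusion the abstract mentions.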