Dialogue systems are among the most promising applications of natural language processing, and multi-round dialogue is a comparatively complex task. Multi-round dialogue aims to carry on a continuous conversation that uses context information to accomplish a specific task. In today's fast-paced era, people often turn to online consultation to save time, so building a multi-round dialogue model for intelligent medical care is a practical necessity. Most existing dialogue systems directly adopt the Transformer or GPT-2 architecture; although these improve performance, they still have shortcomings. To optimize the basic Transformer module and avoid feeding the model a plain concatenation of the dialogue history, this paper improves the layer normalization of the existing Transformer and then designs a model around different information-interaction schemes. The main work of this paper is as follows.

First, this paper proposes a basic component, the Con-LN Transformer Layer, which changes where layer normalization is applied: in the attention module, layer normalization is placed inside the residual block, while in the feed-forward module it is placed outside the residual block. The soundness of this adjustment is briefly argued theoretically. It brings the gradients of different layers closer to one another, so the gradient behaves well at initialization, the learning-rate warm-up stage can be removed, and the expected gradients of parameters near the output layer do not grow large. Comparisons against the Transformer, GPT-2, [→D], and [E→D] baselines verify that the adjustment is effective and feasible for the multi-round dialogue system studied here.

Second, drawing on existing model designs, this paper constructs a multi-round dialogue model, MTrm, that integrates historical information. An improved Transformer Block built on the Con-LN Transformer Layer uses a spliced key-value attention mechanism that injects historical information to assist the transmission of information among the Encoder, the Decoder, and a newly added Mccoder. The paper designs the input sources and information-transmission paths of the Mccoder, which serves as the interaction intermediary between the Encoder and the Decoder and is thus the core of MTrm's handling of contextual information. All three components adopt the same improved Transformer Block structure, enabling parameter sharing and fast convergence.

Third, this paper analyzes the experimental results in depth. The experiments are evaluated with perplexity (PPL), BLEU-2, and F1, and they verify the effectiveness and feasibility of MTrm. Comparative tests show that models built on the improved layer normalization, Con-LN, perform well overall, confirming that adjusting the position of layer normalization is effective and feasible. The variant in which the Mccoder supplies historical information to both the Encoder and the Decoder is generally superior. The best-performing MTrm configuration, E(?)M→D, M(?)D, is used to realize multi-round intelligent medical dialogue; its responses stay more on topic, indicating the stronger overall performance of the model.
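
To make the Con-LN layout concrete, the following is a minimal PyTorch sketch of one such layer, assuming standard multi-head attention and a two-layer feed-forward network; the class name and hyperparameters are hypothetical, since the abstract does not publish code.

```python
import torch
import torch.nn as nn

class ConLNTransformerLayer(nn.Module):
    """Sketch of the Con-LN layout described above: layer normalization inside
    the residual block for attention, outside the residual block for the FFN."""

    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ln_attn = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
        )
        self.ln_ffn = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, attn_mask=None):
        # Attention sublayer: LN applied inside the residual branch.
        h = self.ln_attn(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)
        x = x + self.dropout(attn_out)
        # Feed-forward sublayer: LN applied outside the residual block.
        x = self.ln_ffn(x + self.dropout(self.ffn(x)))
        return x
```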
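
The spliced key-value attention is interpreted here as concatenating keys and values derived from the dialogue history in front of the current keys and values, so each query attends over both history and current context. This is an assumption about the mechanism named above, sketched for single-head attention with tensors of shape (batch, seq, dim):

```python
import torch

def spliced_kv_attention(q, k, v, hist_k, hist_v):
    # Splice history K/V before the current K/V along the sequence axis
    # (assumed reading of the spliced key-value mechanism).
    k = torch.cat([hist_k, k], dim=1)  # (batch, hist_len + seq, dim)
    v = torch.cat([hist_v, v], dim=1)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    weights = torch.softmax(scores, dim=-1)
    return weights @ v                 # (batch, seq, dim)
```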
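
For reference on the PPL metric, perplexity is conventionally the exponential of the mean token-level cross-entropy; the snippet below shows that standard computation, under the assumption that the evaluation follows this common definition (the exact evaluation script is not specified in the abstract):

```python
import math

def perplexity(mean_token_cross_entropy: float) -> float:
    # PPL = exp(mean cross-entropy per token), the usual language-model metric.
    return math.exp(mean_token_cross_entropy)
```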