
Generative Dialogue Model Based On Attention Mechanism

Posted on: 2022-09-09    Degree: Master    Type: Thesis
Country: China    Candidate: Z W Chen    Full Text: PDF
GTID: 2518306572982119    Subject: Computational Mathematics
Abstract/Summary:
Humanity's entry into the information age, and the Internet age in particular, has produced big data on an unprecedented scale. At the same time, the semiconductor industry, driven by Moore's Law, continues to produce chips with ever greater computing power. With the help of big data and high-performance computing hardware, the capabilities of neural networks have been fully demonstrated, and artificial intelligence research has entered a new wave. From AlexNet in 2012 to AlphaFold in 2020, neural-network-based artificial intelligence has made major breakthroughs in many fields. As an important subfield of artificial intelligence, dialogue systems have also seen new development in this wave. Traditional dialogue systems follow the pipeline approach and generate responses by retrieval, which works well for task-oriented dialogue. Open-domain dialogue, that is, chit-chat, calls for lively and natural responses, so the relatively rigid retrieval-based approach is a poor fit for chit-chat tasks. Researchers therefore migrated the sequence-to-sequence model, which had achieved good results on machine translation, to the dialogue task. This does make the generated replies more natural, but it can also produce unreasonable replies.
In order to improve the quality of the responses generated by the sequence-to-sequence model, this thesis builds a sequence-to-sequence model based on the attention mechanism. During encoding, a bidirectional gated recurrent unit (Bi-GRU) processes the input sentence in both directions to obtain a better representation of the sentence; during decoding, a single-layer recurrent unit generates the reply, and the linear attention mechanism and the beam search algorithm are applied. Finally, we conduct experiments on the Star War Scripts dataset to test the performance of this model. The experimental results show that the responses generated by our model are more reasonable and fluent: the model achieves better perplexity and BLEU scores than the comparison models. The results also indicate that, because the encoder processes sentences in both directions with bidirectional recurrent units, it captures richer structural information about the sentence, which helps improve the quality of the generated responses; and because the decoder uses attention, it can adjust each decoding step according to the current decoding state and produce more reasonable sentences.
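The attention step described above, in which the decoder weights the encoder's hidden states by their relevance to the current decoding state, can be sketched as follows. This is a minimal NumPy illustration of a linear (bilinear-scored) attention context, not the thesis's actual implementation; the scoring matrix `W`, the dimensions, and the random inputs are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_context(enc_states, dec_state, W):
    # score each encoder hidden state against the current decoder
    # state through a learned matrix W (linear attention scoring)
    scores = enc_states @ (W @ dec_state)   # shape (T,)
    weights = softmax(scores)               # attention distribution over source positions
    context = weights @ enc_states          # weighted sum of encoder states, shape (d,)
    return context, weights

rng = np.random.default_rng(0)
T, d = 5, 8                      # 5 source positions, hidden size 8
H = rng.normal(size=(T, d))      # encoder hidden states (e.g. from a Bi-GRU)
s = rng.normal(size=d)           # current decoder hidden state
W = rng.normal(size=(d, d))      # assumed learned scoring matrix

ctx, w = attention_context(H, s, W)
print(ctx.shape, w.shape)        # one context vector (d,), one weight per source position
```

At each decoding step the decoder would recompute `ctx` from its current state and feed it into the next GRU step, which is what lets it attend to different source positions as generation proceeds.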
Keywords/Search Tags:Dialogue system, sequence to sequence, Bi-GRU, attention mechanism