
Automatic Generation Of Code Comments Combining Tree2Seq And Attention Mechanism

Posted on: 2022-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: L L Zhao
Full Text: PDF
GTID: 2518306485462344
Subject: Computer Science and Technology
Abstract/Summary:
As the natural-language description of source code, code comments explain and document what the code does. As an important part of software, comments play an important role in software maintenance, reuse, and other activities: they help developers understand code and improve the efficiency of software development. In practice, however, code often suffers from insufficient comments or mismatches between comments and code, so developers spend a great deal of time understanding it. Automatic code comment generation aims to reduce the workload of manual commenting, assist developers in understanding code, and thereby streamline the software development process, playing an important role in improving development efficiency. This thesis surveys the current state of automatic comment generation, analyzes the differences between programming languages and natural language, and, drawing on machine translation techniques from natural language processing, studies how to apply machine translation models effectively to the comment generation task.

(1) Comments are generated automatically for software code based on the Tree2Seq model. The key to automatic comment generation is obtaining code information comprehensively, and the main difference between code and natural language is the code's rich structural information. The abstract syntax tree (AST) of the code contains this structural information, so we use the AST to obtain code information and complete the comment generation task. The classic machine translation model, Seq2Seq, flattens the AST into a sequence by traversal, which loses structural information. To overcome this drawback, this thesis uses the Tree2Seq model, an improvement on the classic Seq2Seq. Its encoder uses a Tree-LSTM to encode the AST, which effectively retains the structural information of the code and thus improves the quality of the generated comments.

(2) The attention mechanism is integrated into the Tree2Seq model to improve the accuracy of generated comments. Because the encoder compresses the input into a fixed-length vector, long inputs encoded by a plain Tree2Seq lose early (preorder) information, which ultimately degrades comment quality. The attention mechanism overcomes this drawback and mitigates the vanishing-gradient problem of the neural network. Tree2Seq with attention selects the most relevant part of the encoded input at each decoding step, allowing the model to retrieve code information accurately.

(3) Comments are generated for software code in multiple languages, and the experimental results are analyzed from multiple perspectives. Experiments on Java and Python datasets compare four models (Seq2Seq, Seq2Seq+Attention, Tree2Seq, and Tree2Seq+Attention) on the automatic comment generation task, confirming the advantages of the model proposed in this thesis. BLEU, ROUGE, and METEOR, which are commonly used in machine translation, serve as automatic evaluation metrics, and a sample of the experimental data is evaluated manually to verify the model's effectiveness. The results show that the model combining Tree2Seq and the attention mechanism achieves a BLEU-4 score of 39.8 on the Java data and 38.2 on the Python data, while the accuracy in manual evaluation reaches about 70%.
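The structural information that a flat token sequence discards can be made concrete with a minimal sketch: parsing a function with Python's standard `ast` module and printing the tree as a bracketed expression, so parent-child relationships stay explicit. The `ast_to_sexp` helper is illustrative, not part of the thesis.

```python
import ast

def ast_to_sexp(node):
    """Render an AST node as a bracketed s-expression, keeping the
    parent-child structure that a flat token stream loses."""
    children = [ast_to_sexp(c) for c in ast.iter_child_nodes(node)]
    label = type(node).__name__
    return f"({label} {' '.join(children)})" if children else f"({label})"

source = "def add(a, b):\n    return a + b"
print(ast_to_sexp(ast.parse(source)))
```

A preorder traversal of this tree would emit the same node labels as a sequence, which is exactly the lossy flattening the Tree2Seq encoder avoids.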
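The Tree-LSTM encoder can be sketched as follows, assuming the Child-Sum variant (Tai et al.); the thesis does not specify which variant it uses, and the toy dimensions and random parameters here are illustrative only. Each AST node combines its own embedding with the states of its children, with one forget gate per child.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy hidden/input size

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Randomly initialised toy parameters for gates i, f, o and candidate u.
W = {g: rng.standard_normal((D, D)) * 0.1 for g in "ifou"}
U = {g: rng.standard_normal((D, D)) * 0.1 for g in "ifou"}
b = {g: np.zeros(D) for g in "ifou"}

def tree_lstm_cell(x, children):
    """Child-Sum Tree-LSTM cell: fold the (h, c) states of an AST node's
    children into the node's own state, given its embedding x."""
    h_sum = sum((h for h, _ in children), np.zeros(D))
    i = sigmoid(W["i"] @ x + U["i"] @ h_sum + b["i"])
    o = sigmoid(W["o"] @ x + U["o"] @ h_sum + b["o"])
    u = np.tanh(W["u"] @ x + U["u"] @ h_sum + b["u"])
    c = i * u
    # One forget gate per child lets the cell drop irrelevant subtrees.
    for h_k, c_k in children:
        f_k = sigmoid(W["f"] @ x + U["f"] @ h_k + b["f"])
        c = c + f_k * c_k
    h = o * np.tanh(c)
    return h, c

# Encode a tiny two-leaf tree bottom-up, as the encoder would an AST.
leaf1 = tree_lstm_cell(rng.standard_normal(D), [])
leaf2 = tree_lstm_cell(rng.standard_normal(D), [])
root_h, root_c = tree_lstm_cell(rng.standard_normal(D), [leaf1, leaf2])
```

The root state `root_h` summarizes the whole subtree, whereas an ordinary LSTM would only ever see a linearized node sequence.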
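The per-step selection that attention performs in the decoder can be sketched with simple dot-product attention; the thesis does not state its exact scoring function, so this is a stand-in for whichever form (additive or multiplicative) the model uses.

```python
import numpy as np

def attention(decoder_h, encoder_states):
    """Dot-product attention: score every encoded AST-node state against
    the current decoder state, normalise with a softmax, and return the
    weighted context vector used to predict the next comment token."""
    scores = encoder_states @ decoder_h        # one score per node
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    context = weights @ encoder_states         # convex combination
    return context, weights

# Toy example: 5 encoded AST nodes with 4-dimensional states.
rng = np.random.default_rng(1)
states = rng.standard_normal((5, 4))
ctx, w = attention(rng.standard_normal(4), states)
```

Because the context is recomputed at every decoding step, long inputs no longer have to squeeze through a single fixed-length vector.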
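The BLEU-4 metric used in the evaluation can be sketched as a sentence-level score with add-one smoothing on the n-gram precisions; this is a simplified stand-in, not the exact (typically corpus-level) implementation used in the thesis's experiments.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(candidate, reference):
    """Sentence-level BLEU-4 with +1 smoothing: geometric mean of the
    modified 1..4-gram precisions, times a brevity penalty."""
    log_prec = 0.0
    for n in range(1, 5):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())   # clipped n-gram matches
        total = sum(cand.values())
        log_prec += math.log((overlap + 1) / (total + 1))
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_prec / 4)

generated = "return the sum of two numbers".split()
reference = "returns the sum of two integers".split()
score = bleu4(generated, reference)
```

A perfect match scores 1.0; the thesis's reported BLEU-4 of 39.8 corresponds to 0.398 on this 0-to-1 scale.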
Keywords/Search Tags:Code comments generation, Deep learning, Tree2Seq, Attention mechanism, Abstract syntax tree