
Research On Code Summary Model Of Graph2Seq Based On Attention Mechanism

Posted on: 2020-12-01 | Degree: Master | Type: Thesis
Country: China | Candidate: J W Gao
GTID: 2428330572996539 | Subject: Computer Science and Technology
Abstract/Summary:
A code summary is a natural-language description of what a piece of code does. High-quality summaries help developers understand code and maintain software, so code summaries have always been an important part of software development. In practice, however, code summaries are often missing, which makes code harder to maintain; this problem has long plagued the industry. It is therefore of considerable research significance and practical value to study how to generate code summaries automatically from code. In studying code summaries, we mainly address two problems. Problem 1: code is a strongly structured language, very different from weakly structured natural language, so making full use of the structural and semantic information in code is the key issue in the code summary task. Problem 2: variable names, method names and other words in code are open-vocabulary words, and the conventional natural language processing approach (e.g., replacing low-frequency words with an <UNK> token) leads to a serious out-of-vocabulary problem. In this thesis, we survey related work on natural language processing and code summaries and propose a Graph2Seq code summary model based on the attention mechanism. The main contributions of our work are as follows. (1) To overcome the problems of existing code summary models based on abstract syntax tree (AST) traversal, we propose an improved Tree2Seq code summary model based on the attention mechanism. (2) Building on this and drawing on related research on code and graphs, we add semantic information such as data-flow information to the AST of the code, extending the AST into a Code Graph structure, and then propose the Graph2Seq code summary model based on the attention mechanism. (3) For open-vocabulary words, we use the Byte-Pair-Encoding (BPE) algorithm to split words into sub-words in the code summary task and then add the sub-words to the Code Graph to solve the open-vocabulary problem.
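To illustrate the sub-word splitting mentioned in the abstract, here is a minimal toy sketch of Byte-Pair-Encoding in Python. It is not the thesis's implementation: the function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the sample identifiers are all assumptions made for illustration. The abstract does not say how the resulting sub-words are attached to the Code Graph, so that step is not shown.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn an ordered list of merges from a list of words (hypothetical helper)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the chosen merge to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a (possibly unseen) word into sub-words by replaying the learned merges."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Sample identifiers are made up for this sketch.
    merges = learn_bpe(["getName", "getUser", "setName", "getUserName"], 10)
    print(segment("setUser", merges))
```

Because an unseen identifier such as `setUser` is always decomposed into known sub-words (in the worst case, single characters), no `<UNK>` token is needed, which is the motivation the abstract gives for using BPE.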
Keywords/Search Tags: Code Summary, Deep Learning, Tree2Seq, Code Graph, Graph2Seq