Automatic Code Summarization Algorithm Based On Gated Convolutional Neural Network

Posted on: 2020-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: C R Yang
Full Text: PDF
GTID: 2428330578951279
Subject: Systems analysis and integration
Abstract/Summary:
With the development of Internet technology and the rise of the open source community, the amount of open source code has exploded, and finding useful information in it takes a great deal of time and effort. Automatic summarization is widely used to obtain the main content of text, but little research addresses code summarization. This paper uses a deep learning method, the convolutional neural network, to extract code features, easing the time and effort cost brought by this information explosion. It splits the automatic code summarization problem into two sub-problems, code feature extraction and automatic summary generation, and proposes a deep learning-based automatic code summarization model that follows an end-to-end design. For code feature extraction, the paper uses a gated convolutional neural network: positional information is added to each input element so that the model knows the position of the word in the sequence, a gated linear unit lets the model select the words or features that are useful for prediction, and residual connections address the vanishing gradient problem. The paper also uses an abstract syntax tree convolutional neural network to extract the structural features of the code, combining Tree-Based CNN and Pre-Order CNN to obtain the complete information of the nodes in the syntax tree. For summary generation, the paper uses an LSTM to learn the short-sequence information in a summary. An attention mechanism is introduced to address the problem caused by an encoder that compresses all the information in the source sentence into a single fixed-length vector, giving the decoder access to more encoder features. In the experiments, a statistical machine translation model, a multi-document automatic summarization model, and two classic machine translation models, Seq2Seq and Convolutional Seq2Seq, serve as baselines; the comparison indicates that the proposed automatic code summarization model is effective and reliable.
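The abstract names the building blocks but gives no formulas, so the following is only a minimal sketch of how they might fit together, not the thesis's implementation. It assumes a gated linear unit of the common form a * sigmoid(b), an odd kernel width with zero padding, and plain dot-product attention scores; the thesis may use different choices. All function names, parameter names, and shapes here (add_positions, gated_conv_block, attention_context, w_a, w_b) are hypothetical and chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function used as the gate in the gated linear unit."""
    return 1.0 / (1.0 + math.exp(-z))


def add_positions(token_vectors, position_vectors):
    """Add a position vector to each token vector, element by element."""
    return [[t + p for t, p in zip(tok, pos)]
            for tok, pos in zip(token_vectors, position_vectors)]


def gated_conv_block(x, w_a, w_b, b_a, b_b, kernel_width=3):
    """One gated convolution layer with a residual connection.

    x        : list of T vectors, each of dimension d
    w_a, w_b : d rows of kernel_width * d weights (assumed layout)
    b_a, b_b : d biases each
    Assumes an odd kernel_width so the output keeps length T.
    """
    seq_len, dim = len(x), len(x[0])
    pad = kernel_width // 2
    zero = [0.0] * dim
    padded = [zero] * pad + x + [zero] * pad
    out = []
    for t in range(seq_len):
        # Flatten the window of kernel_width neighbouring vectors.
        window = [v for vec in padded[t:t + kernel_width] for v in vec]
        row = []
        for j in range(dim):
            a = b_a[j] + sum(w * v for w, v in zip(w_a[j], window))
            b = b_b[j] + sum(w * v for w, v in zip(w_b[j], window))
            # GLU output a * sigmoid(b), plus the residual input x[t][j].
            row.append(a * sigmoid(b) + x[t][j])
        out.append(row)
    return out


def attention_context(decoder_state, encoder_states):
    """Dot-product attention (an assumption): softmax weights and context."""
    scores = [sum(s * h for s, h in zip(decoder_state, enc))
              for enc in encoder_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(encoder_states[0])
    context = [sum(w * enc[j] for w, enc in zip(weights, encoder_states))
               for j in range(dim)]
    return weights, context


if __name__ == "__main__":
    tokens = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    positions = [[0.0, 0.01], [0.01, 0.0], [0.02, 0.01]]
    x = add_positions(tokens, positions)
    w = [[0.1] * 6 for _ in range(2)]  # 2 output dims, window of 3 * 2 values
    encoded = gated_conv_block(x, w, w, [0.0, 0.0], [0.0, 0.0])
    weights, context = attention_context([1.0, 0.0], encoded)
    print(encoded, weights, context)
```

In this sketch the decoder receives a different context vector at every step, built from all encoder outputs, instead of one fixed-length vector for the whole input. The residual term x[t][j] gives gradients a direct path through stacked layers.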
Keywords/Search Tags: Automatic code summary, End-to-end model, Convolutional neural network, LSTM