Research On Code Recommendation And Comment Generation With Context Information

Posted on: 2021-12-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Yan
Full Text: PDF
GTID: 2518306479960879
Subject: Software engineering
Abstract/Summary:
In software development, developers often reuse code to improve efficiency. Existing research usually relies on information retrieval techniques to implement code recommendation and thereby support code reuse. These traditional approaches suffer from a mismatch between the high-level intent expressed in natural language queries and the low-level implementation details of code. In this thesis, we propose Deep CR, a code snippet recommendation approach based on a Sequence to Sequence model. Deep CR uses abstract syntax tree analysis to extract code context information and then applies heuristic preprocessing to build a high-quality data set. Deep CR then trains a Sequence to Sequence query generation model that generates queries for code snippets. Code snippets are recommended by computing the similarity between the generated queries and the natural language queries issued by developers. The data in our code repository originates from Stack Overflow, which ensures that the collected data is realistic. The effectiveness of Deep CR is evaluated by the MRR and Hit@K scores of its recommendation results. The experimental results show that Deep CR outperforms existing approaches and effectively improves the performance of code snippet recommendation.

Improving development efficiency depends on both code recommendation and code comprehension, and comments are crucial to code comprehension. Code without a corresponding comment imposes an extra comprehension burden, which can reduce the efficiency of software development and maintenance. In this thesis, we propose Context CC, a novel neural-network-based approach that automatically generates concise comments for Java methods, leveraging techniques from program analysis and natural language processing. First, Context CC parses abstract syntax trees to extract context information. Second, it filters code and comments out of the context information, using a set of predefined templates and rules, to build a high-quality data set. Finally, Context CC trains a code comment generation model based on recurrent neural networks. Experiments are conducted on Java projects crawled from GitHub. We show empirically that Context CC achieves a BLEU-4 score of about 40.52% and outperforms state-of-the-art baseline methods.
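The abstract names MRR and Hit@K as the measures used for the recommendation results. As a minimal sketch of how these two metrics are commonly computed (not the thesis's own evaluation code), assume that each developer query has been reduced to the 1-based rank of the first relevant snippet in the returned list, or `None` when no relevant snippet was returned. The function names and the sample ranks below are hypothetical.

```python
from typing import List, Optional

# Assumption: each entry is the 1-based rank of the first relevant snippet
# for one query, or None if no relevant snippet appeared in the results.
Ranks = List[Optional[int]]


def mean_reciprocal_rank(ranks: Ranks) -> float:
    """Average of 1/rank over all queries; a miss contributes 0."""
    if not ranks:
        return 0.0
    total = sum(1.0 / r for r in ranks if r is not None)
    return total / len(ranks)


def hit_at_k(ranks: Ranks, k: int) -> float:
    """Fraction of queries whose first relevant snippet is in the top k."""
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


if __name__ == "__main__":
    # Hypothetical ranks for four queries (illustrative, not thesis data).
    sample_ranks: Ranks = [1, 3, None, 2]
    print("MRR:  ", mean_reciprocal_rank(sample_ranks))
    print("Hit@1:", hit_at_k(sample_ranks, 1))
    print("Hit@3:", hit_at_k(sample_ranks, 3))
```

Both metrics reward placing a relevant snippet near the top of the list: MRR is sensitive to the exact position, while Hit@K only asks whether a relevant snippet appears within the first K results.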
Keywords/Search Tags: Abstract Syntax Tree, Program Static Analysis, Sequence to Sequence Model, Code Snippets Recommendation, Code Context Information, Code Comment Generation