Research On A Code Recommendation Tool For Big Code

Posted on: 2021-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: S F Zhou
Full Text: PDF
GTID: 2518306503464684
Subject: Software engineering
Abstract/Summary:
Programmers encounter endless problems during programming, and high-quality code samples relevant to the current code context are essential for them to finish their tasks. However, traditional code reuse techniques such as code completion and clone detection may fail to meet this requirement. To address this problem, this paper proposes Lancer, a code recommendation tool based on big code. Lancer uses a combination of N-Gram and Transformer models to understand the programming intention of the current code and predict the follow-up code. It then uses the BM25 algorithm together with COSEN, a semantic matching algorithm proposed in this work, to retrieve hundreds of candidate code samples. The retrieved candidates are fed into a BERT-based deep semantic ranking model, and the most relevant code samples are finally recommended to programmers, thus increasing their productivity.

The main contributions of this work are as follows:
1) It uses generative code matching instead of simple code generation or code matching. Experimental results show that Lancer achieves an MRR of 0.540 and takes only 0.87 s on average per recommendation, which is far better than the compared approaches.
2) It presents Library-Sensitive N-Gram (LS N-Gram) to better understand the current code and predict the follow-up code. Experimental results show that LS N-Gram outperforms the traditional N-Gram by 1.85%-4.81%.
3) It provides a solution for combining the N-Gram and neural network-based models to further improve code prediction performance. Results suggest that the ensemble model is competitive, with improvements of 0.40%-1.91% in MRR and HR@K.
4) It proposes a deep semantic embedding network named COSEN, which brings a 10.5%-16.8% improvement in semantic understanding and matching.
5) It successfully transfers a BERT model pre-trained on a natural language corpus to the domain of programming languages and presents a deep semantic code sample ranking algorithm based on this BERT model. Experimental results show that this model brings a 16.8%-42.2% improvement.
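As an illustration of the candidate-retrieval stage described above, the following is a minimal sketch of BM25 scoring over an in-memory corpus of code samples. The tokenizer, the toy corpus, the parameter defaults (k1 = 1.2, b = 0.75), and all names below are illustrative assumptions, not taken from the thesis; the LS N-Gram/Transformer prediction step, the COSEN matcher, and the BERT-based ranker are omitted because the abstract does not describe their internals.

import math
from collections import Counter

def tokenize(code):
    # Toy tokenizer: split on whitespace and common punctuation.
    # A real system would use a language-aware lexer.
    for ch in "()[]{};,.=+-*/<>\"'":
        code = code.replace(ch, " ")
    return code.lower().split()

class BM25Retriever:
    """Retrieve candidate code samples with the classic BM25 ranking function."""

    def __init__(self, corpus, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.raw = list(corpus)
        self.docs = [tokenize(doc) for doc in self.raw]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.freqs = [Counter(d) for d in self.docs]
        # Document frequency per term, used for the IDF weights.
        df = Counter()
        for d in self.docs:
            df.update(set(d))
        n = len(self.docs)
        self.idf = {t: math.log((n - f + 0.5) / (f + 0.5) + 1.0) for t, f in df.items()}

    def score(self, query_tokens, i):
        # Standard BM25 score of document i against the query tokens.
        doc_len = len(self.docs[i])
        s = 0.0
        for t in query_tokens:
            tf = self.freqs[i].get(t, 0)
            if tf == 0:
                continue
            denom = tf + self.k1 * (1 - self.b + self.b * doc_len / self.avgdl)
            s += self.idf.get(t, 0.0) * tf * (self.k1 + 1) / denom
        return s

    def retrieve(self, query_code, top_k=100):
        # Return the top_k candidate samples with their BM25 scores.
        q = tokenize(query_code)
        ranked = sorted(range(len(self.docs)), key=lambda i: self.score(q, i), reverse=True)
        return [(self.raw[i], self.score(q, i)) for i in ranked[:top_k]]

# Usage: rank a tiny corpus of code snippets against a predicted code fragment.
corpus = [
    "reader = BufferedReader(FileReader(path))",
    "items.sort(key=lambda item: item.name)",
    "connection = DriverManager.getConnection(url, user, password)",
]
retriever = BM25Retriever(corpus)
for sample, s in retriever.retrieve("read file with BufferedReader FileReader", top_k=2):
    print(f"{s:.3f}  {sample}")

In Lancer's pipeline, the query fed to such a retriever would be the predicted follow-up code rather than a natural-language description, and the retrieved candidates would still be re-ranked by the BERT-based deep semantic model before being shown to the programmer.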
Keywords/Search Tags: Code Recommendation, Statistical Language Model, Text Matching Algorithm, Deep Learning