Research On A Code Recommendation Tool For Big Code

Posted on: 2021-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: S F Zhou
Full Text: PDF
GTID: 2518306503464684
Subject: Software engineering
Abstract/Summary:
Programmers encounter endless problems during programming, and high-quality code samples relevant to the current code context are essential for them to finish their tasks. However, traditional code reuse techniques such as code completion and clone detection may fail to meet this requirement. To address this problem, this paper proposes Lancer, a code recommendation tool based on big code. Lancer uses a combination of N-Gram and Transformer models to understand the programming intention of the current code and predict the follow-up code. It then uses the BM25 algorithm together with COSEN, a semantic matching algorithm proposed in this work, to retrieve hundreds of candidate code samples. The retrieved candidates are fed into a BERT-based deep semantic ranking model, and the most relevant code samples are finally recommended to programmers, thus increasing their productivity.

The main contributions of this work are as follows:
1) It uses generative code matching instead of simple code generation or code matching. Experimental results show that Lancer achieves an MRR of 0.540 and takes only 0.87 s on average per recommendation, which is far better than the compared approaches.
2) It presents Library-Sensitive N-Gram (LS N-Gram) to better understand the current code and predict the follow-up code. Experimental results show that LS N-Gram outperforms the traditional N-Gram by 1.85%-4.81%.
3) It provides a solution for combining the N-Gram and neural network-based models to further improve code prediction performance. Results suggest that the ensemble model is competitive, with improvements of 0.40%-1.91% in MRR and HR@K.
4) It proposes a deep semantic embedding network named COSEN, which brings a 10.5%-16.8% improvement in semantic understanding and matching.
5) It successfully transfers a BERT model pre-trained on a natural language corpus to the domain of programming languages and presents a deep semantic code sample ranking algorithm based on this BERT model. Experimental results show that this model brings a 16.8%-42.2% improvement.
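As an illustration of the candidate-retrieval stage described above, the following is a minimal sketch of BM25 scoring over an in-memory corpus of code samples. The tokenizer, the toy corpus, the parameter defaults (k1 = 1.2, b = 0.75), and all names below are illustrative assumptions, not taken from the thesis; the LS N-Gram/Transformer prediction step, the COSEN matcher, and the BERT-based ranker are omitted because the abstract does not describe their internals.

import math
from collections import Counter

def tokenize(code):
    # Toy tokenizer: split on whitespace and common punctuation.
    # A real system would use a language-aware lexer.
    for ch in "()[]{};,.=+-*/<>\"'":
        code = code.replace(ch, " ")
    return code.lower().split()

class BM25Retriever:
    """Retrieve candidate code samples with the classic BM25 ranking function."""

    def __init__(self, corpus, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.raw = list(corpus)
        self.docs = [tokenize(doc) for doc in self.raw]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.freqs = [Counter(d) for d in self.docs]
        # Document frequency per term, used for the IDF weights.
        df = Counter()
        for d in self.docs:
            df.update(set(d))
        n = len(self.docs)
        self.idf = {t: math.log((n - f + 0.5) / (f + 0.5) + 1.0) for t, f in df.items()}

    def score(self, query_tokens, i):
        # Standard BM25 score of document i against the query tokens.
        doc_len = len(self.docs[i])
        s = 0.0
        for t in query_tokens:
            tf = self.freqs[i].get(t, 0)
            if tf == 0:
                continue
            denom = tf + self.k1 * (1 - self.b + self.b * doc_len / self.avgdl)
            s += self.idf.get(t, 0.0) * tf * (self.k1 + 1) / denom
        return s

    def retrieve(self, query_code, top_k=100):
        # Return the top_k candidate samples with their BM25 scores.
        q = tokenize(query_code)
        ranked = sorted(range(len(self.docs)), key=lambda i: self.score(q, i), reverse=True)
        return [(self.raw[i], self.score(q, i)) for i in ranked[:top_k]]

# Usage: rank a tiny corpus of code snippets against a predicted code fragment.
corpus = [
    "reader = BufferedReader(FileReader(path))",
    "items.sort(key=lambda item: item.name)",
    "connection = DriverManager.getConnection(url, user, password)",
]
retriever = BM25Retriever(corpus)
for sample, s in retriever.retrieve("read file with BufferedReader FileReader", top_k=2):
    print(f"{s:.3f}  {sample}")

In Lancer's pipeline, the query fed to such a retriever would be the predicted follow-up code rather than a natural-language description, and the retrieved candidates would still be re-ranked by the BERT-based deep semantic model before being shown to the programmer.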
Keywords/Search Tags: Code Recommendation, Statistical Language Model, Text Matching Algorithm, Deep Learning