Automatic Code Review Methods Based On Semantic Representation

Posted on:2024-03-10

Degree:Master

Type:Thesis

Country:China

Candidate:B T Wu

GTID:2568306941464614

Subject:Computer Science and Technology

Abstract/Summary:

Code review is the practice in which team members review each other's code during software development to control code quality. It can identify potential errors early in the software lifecycle and promotes mutual understanding among team members. As software grows, the human effort consumed by large-scale code review can no longer be ignored, so automatic code review techniques are needed to predict the outcome of a review. Automatic code review is thus treated as a binary classification task. Based on semantic code representation, this thesis proposes the following three automatic code review methods.

i. To address the problems of using traditional abstract syntax tree features in code review, this thesis proposes an automatic code review method based on an optimized abstract syntax tree. The method simplifies the abstract syntax tree by selectively removing redundant nodes, which improves the tree's ability to represent code, and then uses gated recurrent networks and convolutional neural networks to further learn semantic and structural information, thereby improving the prediction performance of code review. Experiments on nine open-source projects show that the approach outperforms existing methods on all metrics.

ii. To address the low utilization of structural information by code review models, this thesis proposes an automatic code review method based on graph neural networks. The method generates a node-relationship graph from the optimized abstract syntax tree and combines it with semantic information obtained from a bidirectional gated recurrent network. A graph convolutional network then passes information among the nodes to fully capture the structural information between them. Experiments on nine open-source projects show that the automatic code review method based on graph convolutional networks outperforms existing methods on all metrics.

iii. To address the fact that additional multi-source information has not been used effectively in code review, this thesis proposes an automatic code review method based on contrastive learning with multi-source features. The method introduces the annotation information that developers provide when they submit their code, making the predicted result more reliable by exploiting information from several sources. In addition, the method applies contrastive learning to compare and learn from different samples in the same batch, making automatic code review more accurate. Experiments on 11 open-source projects show that the automatic code review method based on contrastive learning and multi-source features outperforms existing methods on all metrics.

The work in this thesis addresses problems in automatic code review concerning feature utilization, model information extraction, and multi-source information utilization, and it promotes the development of the field of automatic code review.
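
The first two methods above rest on simplifying an abstract syntax tree and then building a graph over its remaining nodes. The following is only a rough sketch of that idea, not the thesis's implementation. It assumes Python's standard `ast` module as a stand-in parser, and the set of "redundant" node types (`REDUNDANT`) and the helper names `simplify` and `to_edges` are hypothetical, since the abstract does not say which nodes are removed or how the graph is encoded.

```python
import ast

# Assumption: these node types carry little semantic information.
# The abstract does not list which nodes the thesis actually removes.
REDUNDANT = (ast.Load, ast.Store, ast.Del, ast.Expr)


def simplify(node):
    """Return the simplified subtrees for `node` as (label, children) tuples.

    A redundant node is spliced out: its simplified children are
    handed up to its parent instead of being discarded.
    """
    kids = [t for child in ast.iter_child_nodes(node) for t in simplify(child)]
    if isinstance(node, REDUNDANT):
        return kids
    return [(type(node).__name__, kids)]


def to_edges(tree):
    """Flatten a simplified tree into node labels and parent-child edges."""
    labels, edges = [], []

    def visit(subtree, parent):
        index = len(labels)
        labels.append(subtree[0])
        if parent is not None:
            edges.append((parent, index))
        for child in subtree[1]:
            visit(child, index)

    visit(tree, None)
    return labels, edges


source = "x = a + 1\nprint(x)\n"
root = simplify(ast.parse(source))[0]  # the Module node is never redundant
labels, edges = to_edges(root)
```

Here `labels` and `edges` describe a small node-relationship graph that a graph neural network library could take as input. A learned model would additionally need node embeddings, which this sketch leaves out.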

Keywords/Search Tags:

automatic code review, code representation, graph convolutional network, contrastive learning

Related items

1	Code Vulnerability Detection Approaches Based On Graph Contrastive Learning
2	Automatic Grading And Feedback On Student Code
3	User-Item Graph Representation Learning Based Recommendation Models With Review Semantics
4	Semantic Graph Based Multi-language Code Search
5	Research On Automatic Code Review Methods Based On Machine Learning
6	Research On CNN Based Malicious Code Classification And Detection Technology
7	Code Search Method Based On Graph Representation Learning
8	Research On Spam Review Detection Based On Heterogeneous Graph Representation Of Multiple Relationships
9	Research On Hierarchical Contrastive Learning Based Source Code Representation
10	Object-oriented Software Code Structure Analysis And Knowledge Graph Construction