
Inconsistency Detection For Java Code And Comment Based On Data Flow Analysis And Text Analysis

Posted on: 2022-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tao
Full Text: PDF
GTID: 2518306725981389
Subject: Computer technology
Abstract/Summary:
As a critical component of a software system, comments play an essential role in program understanding and software maintenance. However, inconsistencies between code and comments frequently arise during software development, which increases the cost of program understanding and greatly reduces software maintainability. Most existing work uses program analysis and text analysis to detect parameter constraint inconsistencies between code and comments. However, AST-based program analysis lacks control flow and data flow information, and existing text analysis techniques have a low recognition rate for parameter constraints. In addition, existing tools do not consider content inconsistency between code and comments. This thesis proposes a method for detecting inconsistencies between code and comments based on data flow analysis and text analysis. The method detects not only parameter constraint inconsistencies in Java methods but also content inconsistencies between code and comments. The main work of the thesis includes:

(1) A method for detecting parameter constraint inconsistencies between code and comments is proposed, and the corresponding tool, DCCI, is implemented. First, static data flow analysis of Java bytecode files extracts four kinds of parameter constraints: nullness not allowed, nullness allowed, range limitation, and type restriction. A high-precision method call graph is constructed to analyze how parameter constraints propagate through method calls, and the parameter constraints found in the code are then integrated. Next, in the text analysis stage, the same four kinds of parameter constraints are extracted from comments using dependency parsing and heuristic rules. Finally, the logical expressions of the parameter constraints in the code and in the comments are compared to determine whether they are inconsistent. To evaluate the effectiveness of the method, DCCI was run on seven widely used Java projects, where it reported 2412 inconsistencies, of which 2030 (84.16%) are real inconsistencies. By comparison, the existing tool reported only 2183 inconsistencies, of which 1747 (80.02%) are real.

(2) A method for detecting content inconsistencies between code and comments is proposed. It consists of code and comment preprocessing, word embedding, a Siamese network model based on Bi-LSTM and an attention mechanism, and content inconsistency determination. First, Java code is parsed into an abstract syntax tree (AST) using static program analysis, and the content of the relevant nodes is located and extracted by traversing the AST. Then a pre-trained word embedding model maps the code token sequence and the comment token sequence into the same semantic space, where each is represented as a fixed-length vector. Next, the Siamese network model processes the code and the comment with two submodules of identical structure, and the similarity between code and comment is measured by Manhattan distance. Finally, content inconsistency is detected by applying a threshold to this similarity. To evaluate the effectiveness of the content inconsistency detection method, 300 samples (about 10%) were randomly selected as the test set and the rest were used as the training set. The method achieved 89.42% accuracy on the test set, an improvement over the existing SVM-based method (82.46%).
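The final decision step in (2) can be illustrated with a minimal sketch. The abstract states only that similarity is measured by Manhattan distance and compared against a threshold; mapping the distance to a similarity with exp(-distance), the threshold value 0.5, and the function names below are all assumptions made for illustration. The hand-written vectors stand in for the encodings that the two Bi-LSTM submodules would produce.

```python
import math


def manhattan_similarity(code_vec, comment_vec):
    """Map the Manhattan (L1) distance between two vectors into (0, 1].

    Assumption: similarity = exp(-L1 distance), a common choice for
    Siamese recurrent models; the abstract does not give the formula.
    """
    if len(code_vec) != len(comment_vec):
        raise ValueError("vectors must have the same length")
    distance = sum(abs(a - b) for a, b in zip(code_vec, comment_vec))
    return math.exp(-distance)


def is_inconsistent(code_vec, comment_vec, threshold=0.5):
    """Flag a code/comment pair whose similarity falls below the threshold.

    Assumption: 0.5 is a placeholder; a real threshold would be tuned
    on held-out data.
    """
    return manhattan_similarity(code_vec, comment_vec) < threshold


# Hypothetical encodings standing in for the Siamese submodule outputs.
code_encoding = [0.20, 0.70, 0.10]
matching_comment = [0.25, 0.65, 0.10]    # close to the code encoding
mismatched_comment = [0.90, 0.10, 0.60]  # far from the code encoding

print(is_inconsistent(code_encoding, matching_comment))
print(is_inconsistent(code_encoding, mismatched_comment))
```

Because exp(-distance) equals 1 for identical vectors and decays toward 0 as they diverge, a single threshold on the similarity is equivalent to a threshold on the distance itself.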
Keywords/Search Tags:data flow analysis, text analysis, code-comment inconsistency detection