
Research on an ECC Multi-label Code Smell Detection Method Based on Ranking Loss

Posted on: 2022-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: J N Wang
GTID: 2518306476496194
Subject: Computer application technology
Abstract/Summary:
A code smell is a deep-level software quality problem introduced by developers' bad programming habits or violations of design principles. It is a manifestation of bad code or design. In a real software system, a class or method may have multiple code smells at the same time, and some code smells are correlated, so they are more likely to occur together. The interaction between them reduces the readability and comprehensibility of the code and increases the complexity and maintenance difficulty of the software system, resulting in a significant decrease in software quality. In addition, code smells can appear in many forms. For example, method-level code smells often cause class-level code smells. This order of appearance is important when suggesting refactoring strategies.

Multi-label classification is suitable for detecting the situation described above, where at least one kind of code smell is present. It can treat a pair of related code smells as a label group and better account for the dependency between them. However, existing multi-label code smell detection methods do not consider the order in which multiple code smells appear within the same label group. Therefore, this thesis combines the correlation and the appearance order of code smells and proposes an ensemble of classifier chains (ECC) multi-label code smell detection method based on ranking loss. The main work is as follows:

(1) Code smell correlation: mining code smell correlations with association rules and factor analysis, and building the resulting pairs of related code smells into a multi-label dataset (MLD).
(2) Algorithm performance comparison: comparing the code smell detection results of logistic regression (LR), support vector machine (SVM), C4.5, random forest (RF), and extreme gradient boosting (XGBoost), and determining which classifiers are suitable for code smell detection.
(3) Code smell detection sequence: considering the influence of the detection order of multiple smells in the same code component, and simulating the mechanism by which code smells are generated. Ranking loss is taken as the loss function for multi-label classification. With its minimization as the goal, the optimal set of label sequences is selected to detect whether the related code smell pairs exist in the same code element.

The experiments are based on 9 code smells in 4 open-source Java systems. First, verification experiments show that a fixed label order cannot effectively capture the correlation between code smells. Then, the detection results of the tree classifiers under the proposed multi-label framework are compared, and XGBoost is found to perform better. Furthermore, using the same classifier, the proposed method is compared with two other multi-label methods, classifier chains (CC) and ECC. The results show that the proposed multi-label method is superior to the existing multi-label methods. Finally, the detection results for each code smell under single-label classification and under the proposed multi-label method are analyzed, which further confirms the effectiveness of the method.
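The idea of choosing label sequences by minimizing ranking loss can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `ranking_loss` and `select_orders`, the smell labels, and the validation scores are all hypothetical, and the scores are assumed to come from classifier chains that were already trained with each candidate label order. Ranking loss is computed here in its common form, as the average fraction of (present smell, absent smell) pairs that the scores rank in the wrong order.

```python
def ranking_loss(y_true, y_score):
    """Mean fraction of (relevant, irrelevant) label pairs ranked in the wrong order."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t == 1]
        irrelevant = [j for j, t in enumerate(truth) if t == 0]
        if not relevant or not irrelevant:
            continue  # no pair to compare: this sample contributes 0
        # A pair is wrong when an absent smell scores at least as high as a
        # present one (ties are counted as errors in this sketch).
        wrong = sum(1 for r in relevant for i in irrelevant
                    if scores[r] <= scores[i])
        total += wrong / (len(relevant) * len(irrelevant))
    return total / len(y_true) if y_true else 0.0


def select_orders(candidates, y_true, k=1):
    """Return the k label orders whose validation scores give the lowest ranking loss."""
    ranked = sorted(candidates,
                    key=lambda order: ranking_loss(y_true, candidates[order]))
    return ranked[:k]


# Hypothetical validation data. Columns are always (long_method, god_class),
# regardless of the order in which a chain predicted them.
y_true = [[1, 0], [1, 1], [0, 1]]
candidates = {
    # key: label order used by a chain; value: that chain's validation scores
    ("long_method", "god_class"): [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6]],
    ("god_class", "long_method"): [[0.4, 0.5], [0.8, 0.7], [0.3, 0.6]],
}
best = select_orders(candidates, y_true, k=1)
```

Setting `k` above 1 keeps several low-loss orders, which is one plausible way to form the set of label sequences used by an ensemble of chains; how the thesis builds its set is not specified in the abstract.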
Keywords/Search Tags: code smell, association rules, factor analysis, ranking loss, ECC