Research On SVM-based Evaluation Method Of Clone Code Harmfulness

Posted on:2014-03-06

Degree:Master

Type:Thesis

Country:China

Candidate:Z C Li

Full Text:PDF

GTID:2268330422950628

Subject:Computer Science and Technology

Abstract/Summary:

Clone code (also known as duplicated code) has long been a popular research field in software engineering. Because earlier work focused on a single version of the software, the traditional view is that clone code is harmful to the program and should be promptly detected and refactored. However, a few studies in recent years have found that cloned code is not necessarily harmful: during code evolution, some clones exist only for a very short life cycle, while others that never change in the system are highly robust, so it is unwise to refactor them blindly. Therefore, considering system stability, maintenance cost, refactoring difficulty and other factors, a comprehensive evaluation of the harmfulness of clone code is necessary. Unfortunately, there have been very few such studies so far.

To address the problem of clone code harmfulness evaluation, we propose an SVM-based method that uses both static metrics and evolution metrics to evaluate the harmfulness of clone code. Following the research methods of software defect prediction, our method treats the evaluation of clone code harmfulness as a supervised classification problem in machine learning. First, we propose a standard definition of clone code harmfulness and a corresponding sample labeling method. Second, drawing on related research on software defect prediction and clone code evolution, we propose two kinds of software metrics, static metrics and evolution metrics, to characterize clone code, and choose the SVM as the core algorithm of our evaluation model. Finally, we train and test this model on clone code samples described by both static metrics and evolution metrics. After cross-validation and parameter optimization, the clone code harmfulness evaluation model is complete.

In the last section of this thesis, we run experiments on six open-source software systems of different types, written in three programming languages, to verify and assess our method. The experimental results show that the proposed method has good applicability and high accuracy. It is a meaningful attempt to evaluate the harmfulness of clone code. In addition, the analysis of the influence of the proposed metrics, together with our experimental results, provides a valuable reference for future studies.
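The pipeline described in the abstract (metric vectors as features, harmful/harmless labels, an SVM classifier, cross-validation and parameter optimization) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis's implementation: the abstract does not name the kernel, the library, or the concrete metrics. The sketch swaps in a linear soft-margin SVM trained with Pegasos-style stochastic sub-gradient descent so that it runs with the standard library only, uses two hypothetical pre-scaled features (one "static" and one "evolution" metric), and generates synthetic data rather than real clone samples.

```python
import random

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, C=1.0, epochs=50, seed=0):
    """Linear soft-margin SVM via Pegasos-style stochastic sub-gradient descent.

    Labels must be +1 (harmful) or -1 (harmless). A constant feature is
    appended so the last weight acts as a (regularized) bias term.
    """
    rng = random.Random(seed)
    n = len(X)
    Xa = [list(x) + [1.0] for x in X]
    lam = 1.0 / (C * n)              # larger C means weaker regularization
    w = [0.0] * len(Xa[0])
    order = list(range(n))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)    # decreasing step size
            violated = y[i] * dot(w, Xa[i]) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]
            if violated:             # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xa[i])]
    return w

def predict(w, x):
    return 1 if dot(w, list(x) + [1.0]) >= 0.0 else -1

def cross_val_accuracy(X, y, C, k=5):
    """k-fold cross-validation: every sample is tested exactly once."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        w = train_linear_svm(X_train, y_train, C=C)
        correct += sum(predict(w, X[i]) == y[i] for i in test_idx)
    return correct / n

def grid_search(X, y, C_values=(0.01, 0.1, 1.0, 10.0), k=5):
    """Parameter optimization: pick the C with the best cross-validated accuracy."""
    scores = {C: cross_val_accuracy(X, y, C, k) for C in C_values}
    best_C = max(scores, key=scores.get)
    return best_C, scores

def make_synthetic_clones(n_per_class=20, seed=42):
    """Synthetic stand-in data: [static_metric, evolution_metric], already scaled.

    Hypothetical assumption: harmful clones score high on both metrics.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)])
        y.append(1)
        X.append([rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)])
        y.append(-1)
    return X, y

X, y = make_synthetic_clones()
best_C, scores = grid_search(X, y)
print("best C:", best_C, "cross-validated accuracy:", scores[best_C])
```

On real clone data the features would be the measured static and evolution metrics, scaled using statistics from the training folds only, and a kernel SVM from an established library would likely replace this hand-written linear trainer. The overall structure (label, extract metrics, cross-validate, tune parameters) is the part that mirrors the abstract.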

Keywords/Search Tags:

software evolution, software metrics, code clone, harmfulness evaluation, support vector machine

Related items

1	Research On Software Clone Genealogies Construction And Evolution Features Extraction
2	Research On Harmfulness Prediction Of Clone Code Based On Bayesian Network
3	Research On Code Clone Extension Analysis And Management Technology
4	Research On Predicting Harmfulness Of Code Clones Based On The Topic Model
5	Research On Analysis And Consistency Maintenance Of Code Clone Based On Software Evolution
6	Research On The Methods Of Java Code Clone Detecting
7	Code Complexity Based Software Evolution Evaluation And Analysis
8	Research On Code Clone Detection And Clone Bug Finding
9	Clone Code Harmfulness Prediction Research Of Unbalanced Classification And Feature Selection Problem
10	Identify And Recommend Refactoring Clones Using Software Evolution History