
Research On Change Barrier Code Smell Detection Based On Logistic Regression

Posted on: 2021-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: T T Lv
Full Text: PDF
GTID: 2428330611451412
Subject: Software engineering
Abstract/Summary:
Software development techniques are increasingly iterative. During development, code smells are easily introduced for reasons such as tight development cycles and project complexity. Many studies have shown that code smells reduce code comprehensibility and maintainability, make programs error-prone, and cause deep design problems. Code smell detection is therefore extremely important during software development and maintenance. Many techniques have been used to detect code smells, and machine learning-based detection has become a mainstream approach in recent years. However, existing machine learning methods have two limitations: (1) most articles focus only on common smells, and (2) the proposed metrics are ineffective for detecting uncommon code smells, e.g., change barrier based code smells.

To overcome these limitations, this paper focuses on three uncommon change barrier code smells: Parallel Inheritance, Shotgun Surgery, and Divergent Change. It is the first to systematically study the recognition and prediction of change barriers in the field of machine learning and to analyze their domain-specific metrics. A change barrier code smell means that a change in the code affects many classes. Detecting change barriers can help improve the quality and readability of the code, reduce the workload of later maintenance, and improve work efficiency. It can also help developers find design problems that may cause adverse effects in future maintenance work.

This paper proposes a change barrier detection algorithm based on Logistic Regression, which predicts code smells through data extraction, data preprocessing, data storage, and model construction and prediction. First, the domain-specific metrics related to the characteristics of each code smell are extracted, and these features are then preprocessed. The processed metrics are used to build a Logistic Regression model and optimize its parameters. The model with the best parameter configuration is then trained on the data. Finally, the trained model predicts on the code to be detected and returns the smell detection result. Experimental verification shows that the precision and recall of this method range from 81.8% to 100%, which outperforms existing algorithms by 10%-30%. This paper also analyzes the similarity and importance of the metrics used. It finds that the domain-specific metrics are highly important for the detection of change barriers: they enable the Logistic Regression model to achieve better performance, which confirms the key role of these metrics in smell identification. The results of this research should help practitioners better design detection tools for such code smells.
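The pipeline described above (metric preprocessing, Logistic Regression training, prediction, and precision/recall evaluation) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The two metrics (classes co-changed per commit and distinct reasons for change), the toy data, and all function names are hypothetical, the model is a plain gradient-descent logistic regression written with the standard library only, and the parameter optimization step mentioned in the abstract is omitted for brevity.

```python
import math

def standardize(rows):
    """Preprocessing: scale each metric column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def apply_scaling(rows, means, stds):
    """Apply previously computed column means and deviations to new rows."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, X, threshold=0.5):
    """Label a class as smelly (1) when the predicted probability reaches the threshold."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold
            else 0 for xi in X]

def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical toy data: [classes co-changed per commit, distinct reasons for change].
metrics = [[1, 1], [2, 1], [1, 2], [2, 2], [8, 6], [9, 7], [7, 8], [10, 9]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = change barrier smell (assumed labels)

means, stds = standardize(metrics)
X_train = apply_scaling(metrics, means, stds)
weights, bias = train(X_train, labels)

# Evaluate on the training data, then classify two unseen (made-up) classes.
precision, recall = precision_recall(labels, predict(weights, bias, X_train))
new_preds = predict(weights, bias, apply_scaling([[1.5, 1.5], [9, 8]], means, stds))
print(precision, recall, new_preds)
```

In practice the metrics would come from a project's change history rather than a hand-written list, and precision and recall would be measured on held-out data rather than on the training set as done here.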
Keywords/Search Tags: Code Smells, Change Barrier, Logistic Regression, Machine Learning, Software Development