Study Of Software Self-admitted Technical Debt Predictive Approach Based On LDA And Cross Oversampling

Posted on:2021-02-16

Degree:Master

Type:Thesis

Country:China

Candidate:C Huang

Full Text:PDF

GTID:2428330611996874

Subject:Computer technology

Abstract/Summary:

In recent years, as software engineering has developed and software systems have grown more complex, software self-admitted technical debt has attracted considerable attention from both industry and academia. Self-admitted technical debt arises when, at any point in the software development lifecycle, developers deliberately take shortcuts to complete the code implementation as quickly as possible in pursuit of a project's short-term benefits. This compromise can lead developers to submit code that is imperfect, needs rework, generates errors, or is only a temporary solution. After years of research, researchers have proposed a number of models and algorithms for identifying self-admitted technical debt, but some identification patterns are extracted by hand and the class imbalance problem is not considered. To address these problems, this thesis takes the identification performance for self-admitted technical debt as its starting point and studies two issues in turn: extracting the distinguishing words of self-admitted technical debt, and handling class imbalance. The main contributions of this thesis are the following two.

1) Identification model. Earlier work selected the identification patterns of self-admitted technical debt by simple manual selection, which yielded only 62 patterns. To address this, the thesis proposes using the LDA (Latent Dirichlet Allocation) algorithm to extract the distinguishing words that identify self-admitted technical debt. The results show that, compared with manual extraction, LDA efficiently uncovers additional hidden distinguishing words.

2) Class imbalance. In a traditional binary or multi-class imbalanced classification problem, the predictions of traditional classifiers tend to favour the majority class, which leads to poor prediction performance on the minority class. To address this, the thesis proposes a cross-oversampling method. By splitting apart samples in the minority class, the algorithm constructs a certain proportion of virtual samples to increase the number of minority-class samples, thereby effectively expanding the self-admitted technical debt data. Feature selection is then used to construct multiple classifiers that identify self-admitted technical debt.

The experimental results indicate that, compared with prior methods, the approach both expands the identification patterns of self-admitted technical debt and improves identification performance to a certain extent. With LDA and cross-oversampling combined, the set of distinguishing words for self-admitted technical debt is effectively expanded and the class imbalance is mitigated.
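The abstract does not spell out how cross-oversampling splits and recombines minority-class samples, so the following Python sketch is only one plausible reading and not the thesis's actual algorithm. It assumes each sample is a short text comment, splits two randomly chosen minority comments at their token midpoints, and joins the first half of one to the second half of the other to form a virtual sample. The function name `cross_oversample` and the parameters `ratio` and `seed` are hypothetical.

```python
import random
from typing import List, Sequence


def cross_oversample(minority: Sequence[str], ratio: float = 1.0, seed: int = 0) -> List[str]:
    """Build virtual minority samples by crossing halves of real ones.

    ASSUMPTION: the midpoint split and pairwise recombination are an
    illustrative guess at "cross-oversampling", not the thesis's method.
    `ratio` is the number of virtual samples per real minority sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to cross")
    rng = random.Random(seed)          # local RNG keeps the result reproducible
    pool = list(minority)
    n_new = int(len(pool) * ratio)     # how many virtual samples to create
    synthetic: List[str] = []
    for _ in range(n_new):
        a, b = rng.sample(pool, 2)     # two distinct parent samples
        tokens_a, tokens_b = a.split(), b.split()
        head = tokens_a[: (len(tokens_a) + 1) // 2]   # first half of parent a
        tail = tokens_b[len(tokens_b) // 2:]          # second half of parent b
        synthetic.append(" ".join(head + tail))
    return synthetic


if __name__ == "__main__":
    debt_comments = [
        "TODO fix this hack later",
        "FIXME temporary workaround for bug",
        "this is ugly but works for now",
    ]
    for sample in cross_oversample(debt_comments, ratio=1.0, seed=1):
        print(sample)
```

In a full pipeline the virtual samples would be appended to the minority class before feature selection and classifier training. Whether the split should happen at the token level, as here, or on feature vectors is a design choice the abstract leaves open.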

Keywords/Search Tags:

Self-admitted technical debt, Software engineering, Class imbalance, Cross oversampling, Feature selection

Related items

1	Research On Technical Debt Detection And Classification Methods Based On Code Comments
2	Research On Software Defect Prediction Method Based On Feature Selection
3	Research On Software Requirement Changes Technical Debt
4	Research On The Application Of Generative Adversarial Networks In Class Imbalance
5	Relationships Between Evaluation Criteria Of Feature Selection And Analysis On Class Imbalance Problem Over Vhr Remote Sensing Imagery
6	Research On Transfer-sampling Based Method For Class-imbalance Learning
7	Research On Data Preprocessing Technology In Cross Project Software Defect Prediction
8	Research On Imbalanced Datasets Classification Based On Machine Learning And Oversampling Methods
9	Research On Software Defect Prediction Method Based On Fusion Feature Selection And Ensemble Learning
10	Measuring Technical Debt Of Requirement Change By Marginal Contribution