Research On Key Class Identification For Software Projects Based On Complex Network Theory

Posted on: 2022-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: P Tao
Full Text: PDF
GTID: 2480306773496504
Subject: Computer Software and Application of Computer
Abstract/Summary:
As the functionality of a software system changes continually, the system grows in scale and its structure becomes increasingly complex. In the software maintenance phase, maintainers who want to refine or modify an unfamiliar software system must first understand it. The information obtained by reverse engineering is too complex, and most software systems lack timely and accurate documentation to help maintainers understand them. Program understanding therefore consumes a great deal of maintainers' effort. This thesis aims to analyze and filter software information so as to provide software maintainers with the most essential information in the system. In object-oriented software systems, mining the essential information of the system can be regarded as identifying the key classes in the system. Based on complex network theory, this thesis proposes a new model to identify key nodes in software networks; these key nodes are the key classes of software projects. The thesis also proposes using a voting algorithm to determine the key classes of software projects, and the performance of the proposed key class identification approach is compared with eight baseline approaches to verify the effectiveness of the new model. The main research contents of this thesis are as follows:

1. A new model is proposed to calculate the importance of classes in software projects. First, the source files of the software project are parsed, and the parsed class information and class dependency information are analyzed to obtain a software dependency dataset. Then, a corresponding weighted software dependency network is constructed from this dataset: the nodes in the network represent the classes in the software project, the edges between the nodes represent the dependencies between the classes, and the weights on the edges represent the strength of those dependencies. Finally, the KCI (Key Class Identification) algorithm is used to calculate the importance of the class nodes in the software dependency network, and the class importance scores are sorted in descending order. The top-ranked classes are the key classes identified by the new model.

2. A voting algorithm is proposed to vote out the key classes in software projects, and the performance of the KCI approach is compared with eight baseline approaches. This thesis conducts experiments on 25 different software projects. For each project, the KCI approach and the eight baseline approaches jointly vote for the top 15 and top 30 key classes, which are taken as the key classes of that project. The recall and precision of the KCI approach are then compared with those of the eight baseline approaches to verify the effectiveness of the new model. The experimental results show that the KCI approach has the highest recall and accuracy on 23 software projects. For the top 15 key classes, it is outperformed only by the In-Degree approach, on the software project Pool2. For the top 30 key classes, it is outperformed only by the PageRank approach, on the software project Ormlite core. Overall, the results show that the new model can identify key classes in software projects more accurately. In addition, this thesis compares the CPU time of the KCI approach on software projects of different scales. The longest running time of the KCI approach across the 25 software projects is no more than 1.4 seconds, which shows that the new model can be applied to software projects of different scales.
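
The overall pipeline described in the abstract (weighted dependency network, importance ranking, voting, then precision and recall) can be illustrated with a small Python sketch. It rests on several assumptions: the abstract does not give the KCI scoring formula, so simple weighted-degree scores stand in for both KCI and the baselines; the voting rule (one vote per approach for each class in its top-k list, keeping the k most-voted classes) is a guess at the scheme; and the class names, weights, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical dependency records: (dependent class, class depended on, strength).
DEPENDENCIES = [
    ("OrderService", "Order", 3.0),
    ("OrderService", "Repository", 2.0),
    ("InvoiceService", "Order", 2.0),
    ("InvoiceService", "Repository", 1.0),
    ("Order", "Money", 1.0),
    ("Cli", "OrderService", 1.0),
]


def build_network(dependencies):
    """Build a weighted directed network as {source: {target: weight}}."""
    graph = defaultdict(dict)
    nodes = set()
    for source, target, weight in dependencies:
        graph[source][target] = graph[source].get(target, 0.0) + weight
        nodes.update((source, target))
    return graph, nodes


def degree_scores(graph, nodes, incoming=True, outgoing=False):
    """Stand-in importance score (not KCI): sum of selected edge weights."""
    scores = {node: 0.0 for node in nodes}
    for source, targets in graph.items():
        for target, weight in targets.items():
            if incoming:
                scores[target] += weight
            if outgoing:
                scores[source] += weight
    return scores


def top_k(scores, k):
    """Sort scores in descending order (ties broken by name) and keep k classes."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def vote(rankings, k):
    """Assumed voting rule: one vote per approach per listed class; keep the k most voted."""
    votes = Counter()
    for ranking in rankings:
        votes.update(ranking[:k])
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return {name for name, _ in ranked[:k]}


def precision_recall(predicted, reference):
    """Precision and recall of a predicted list against the voted reference set."""
    hits = len(set(predicted) & set(reference))
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


graph, nodes = build_network(DEPENDENCIES)
rankings = {
    "in_degree": top_k(degree_scores(graph, nodes, True, False), 2),
    "out_degree": top_k(degree_scores(graph, nodes, False, True), 2),
    "total_degree": top_k(degree_scores(graph, nodes, True, True), 2),
}
reference = vote(rankings.values(), 2)
for name, ranking in rankings.items():
    print(name, ranking, precision_recall(ranking, reference))
```

In the thesis the candidate lists would come from KCI and the eight baselines with k set to 15 and 30; the sketch uses three degree-based rankers and k = 2 only to keep the example small.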
Keywords/Search Tags:Object-Oriented Software, Software Dependency Network, Key Class, Voting Algorithm, Complex Network