A heuristic search approach to solving the software clustering problem

Posted on:2003-04-22

Degree:Ph.D

Type:Dissertation

University:Drexel University

Candidate:Mitchell, Brian Scott

GTID:1468390011486309

Subject:Computer Science

Abstract/Summary:

Most interesting software systems are large and complex, and as a consequence, understanding their structure is difficult. One of the reasons for this complexity is that source code contains many entities (e.g., classes, modules) that depend on each other in intricate ways (e.g., procedure calls, variable references). Additionally, once a software engineer understands a system's structure, it is difficult to preserve this understanding, because the structure tends to change during maintenance.

Research into the software clustering problem has proposed several approaches to deal with the above issue by defining techniques that partition the structure of a software system into subsystems (clusters). Subsystems are collections of source code resources that exhibit similar features, properties or behaviors. Because there are far fewer subsystems than modules, studying the subsystem structure is easier than trying to understand the system by analyzing the source code manually.

Our research addresses several aspects of the software clustering problem. Specifically, we created several heuristic search algorithms that automatically cluster the source code into subsystems. We implemented our clustering algorithms in a tool named Bunch, and conducted extensive evaluation via case studies and experiments. Bunch also includes a variety of services to integrate user knowledge into the clustering process, and to help users navigate through complex system structures manually.

Since the criteria used to decompose the structure of a software system into subsystems vary across different clustering algorithms, mechanisms that can compare different clustering results objectively are needed. To address this need we first examined two techniques that have been used to measure the similarity between system decompositions, and then created two new similarity measurements to overcome some of the problems that we discovered with the existing measurements.

Similarity measurements enable the results of clustering algorithms to be compared to each other, and preferably to be compared to an agreed upon “benchmark” standard. Since benchmark standards are not documented for most systems, we created another tool, called CRAFT, that derives a “reference decomposition” automatically by exploiting similarities in the results produced by several different clustering algorithms.
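
The idea of clustering by heuristic search, as summarized in the abstract, can be illustrated with a minimal hill-climbing sketch. Everything here is an assumption made for illustration: the function names `score` and `hill_climb` are hypothetical, the quality score (a per-cluster ratio of internal dependencies to internal plus half of the external ones) was chosen for this sketch, and none of it is presented as the objective function or search procedure actually implemented in Bunch.

```python
import random


def score(edges, assignment):
    """Illustrative quality score (an assumption of this sketch).

    For each cluster, compute intra / (intra + inter / 2), where intra counts
    dependencies inside the cluster and inter counts dependencies crossing
    its boundary. Clusters without internal dependencies contribute nothing.
    """
    intra = {}
    inter = {}
    for src, dst in edges:
        a, b = assignment[src], assignment[dst]
        if a == b:
            intra[a] = intra.get(a, 0) + 1
        else:
            inter[a] = inter.get(a, 0) + 1
            inter[b] = inter.get(b, 0) + 1
    total = 0.0
    for cluster, n_intra in intra.items():
        total += n_intra / (n_intra + 0.5 * inter.get(cluster, 0))
    return total


def hill_climb(modules, edges, seed=0):
    """Move one module at a time to another cluster while the score improves."""
    rng = random.Random(seed)
    # Start from the trivial partition: every module in its own cluster.
    assignment = {m: i for i, m in enumerate(modules)}
    best = score(edges, assignment)
    improved = True
    while improved:
        improved = False
        order = list(modules)
        rng.shuffle(order)
        for m in order:
            best_cluster = assignment[m]
            # Try every existing cluster and keep the best strictly better move.
            for target in sorted(set(assignment.values())):
                if target == best_cluster:
                    continue
                assignment[m] = target
                candidate = score(edges, assignment)
                if candidate > best + 1e-12:
                    best = candidate
                    best_cluster = target
                    improved = True
            assignment[m] = best_cluster
    return assignment, best


if __name__ == "__main__":
    # Hypothetical dependency graph: two triangles joined by one dependency.
    modules = ["a", "b", "c", "d", "e", "f"]
    edges = [("a", "b"), ("b", "c"), ("c", "a"),
             ("d", "e"), ("e", "f"), ("f", "d"),
             ("c", "d")]
    clusters, quality = hill_climb(modules, edges)
    print(clusters, quality)
```

Hill climbing of this kind only guarantees a local optimum, so the result can depend on the starting partition and on the order in which modules are visited; repeating the search with different seeds is one simple way to explore that sensitivity.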

Keywords/Search Tags:

Clustering, Software, Structure, System, Source code
