Improving requirements clustering in an interactive and dynamic environment

Posted on: 2010-02-06
Degree: Ph.D.
Type: Dissertation
University: DePaul University, School of Computer Science, Telecommunications, and Information Systems
Candidate: Duan, Chuan
Full Text: PDF
GTID: 1448390002985133
Subject: Computer Science
Abstract/Summary:
Large-scale software systems challenge almost every activity in the software development life cycle, including tasks related to eliciting, analyzing, and specifying requirements. Fortunately, many of these complexities can be addressed by clustering the requirements to create abstractions that are meaningful to human stakeholders. For example, the requirements elicitation process can be supported by dynamically clustering incoming stakeholders' requests into themes. Cross-cutting concerns, which have a significant impact on the architectural design, can be identified through fuzzy clustering techniques and metrics designed to detect when a theme cross-cuts the dominant decomposition of the system. Traceability techniques, required in critical software projects by many regulatory bodies, can be automated and enhanced by cluster-based information retrieval methods. Domain analysis in product line development can be partially automated by clustering features to identify shared and variable components.

Unfortunately, despite a significant body of work describing document clustering techniques, there is almost no prior work that systematically evaluates the challenges, constraints, and nuances of requirements clustering. As a result, developers and researchers select requirements clustering algorithms with little understanding of their appropriateness for the task, and the effectiveness of software engineering tools and processes that depend on requirements clustering is severely limited.

This research directly addresses the problem of improving requirements clustering. The main contributions are an extensive study of existing clustering methods and their combinations, and the development of new algorithms that adaptively deliver high-quality clusters in various application contexts. Among a set of popular basic hierarchical and partitioning algorithms, two-stage spherical K-means was identified as the most effective. Further experiments with a number of enhancement methods showed that consensus clustering, a hybrid approach that combines the knowledge from multiple clusterings, is effective and robust for clustering requirements.

Two new approaches are proposed to address unique domain characteristics. The first approach takes advantage of the human-centric environment of many software engineering activities, such as online requirements gathering forums or automated traceability tools, in which users provide feedback about the quality of each cluster. To utilize this feedback, a consensus-based constrained clustering approach is proposed and empirically shown to be effective in choosing informative feedback and generating improved clusterings. The second approach addresses the need to balance change against stability in the incremental clustering process. Clusters change as requirements evolve and users' feedback is elicited, yet users desire stability. For example, when clustering is used to automatically organize requirements in a visible context such as a feature gathering forum or an automated trace tool, the set of feature requests is dynamic and clustering is incremental, yet stakeholders do not want clusters to change constantly. This is particularly evident in a web-based forum, where the generated clusters are used to anchor discussion threads and users do not want their posts to move significantly between multiple threads. A seed-preserving clustering method was therefore designed to maintain stability without sacrificing clustering quality.

This work developed a deeper insight into effective requirements clustering techniques and specific enhancements that deliver higher-quality clusters. These findings are anticipated to be useful for supporting requirements-related activities.
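
As a rough illustration of the spherical K-means family mentioned in the abstract, the sketch below clusters unit-normalized term vectors by cosine similarity. It shows only the plain single-stage algorithm, not the dissertation's two-stage variant, whose details the abstract does not give. The function names, the toy vocabulary, and the choice to pass initial centroids in as `seeds` are all assumptions made for this example.

```python
import math

def normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine(u, v):
    """Dot product, which equals cosine similarity for unit vectors."""
    return sum(a * b for a, b in zip(u, v))

def spherical_kmeans(vectors, seeds, max_iter=100):
    """Cluster vectors by cosine similarity, starting from seed centroids.

    Returns (labels, centroids). Hypothetical helper, not the author's code.
    """
    points = [normalize(v) for v in vectors]
    centroids = [normalize(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its most similar centroid.
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the normalized sum of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                summed = [sum(col) for col in zip(*members)]
                centroids[k] = normalize(summed)
    return labels, centroids

# Toy term counts over an assumed vocabulary: [login, password, report, export]
requirements = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 3],
]
labels, centroids = spherical_kmeans(
    requirements, seeds=[requirements[0], requirements[2]]
)
```

Because the initial centroids are an explicit argument, a later run over a grown set of requirements could be started from the centroids of the previous run. That is one simple way to bias an incremental run toward the earlier grouping; it is not a description of the seed-preserving method proposed in the dissertation.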
Keywords/Search Tags: Requirements, Clustering, Clusters, Software, Quality