CBR-based software quality models and quality of data

Posted on: 2006-08-18
Degree: Ph.D.
Type: Dissertation
University: Florida Atlantic University
Candidate: Xiao, Yudong
Full Text: PDF
GTID: 1458390005996012
Subject: Computer Science
Abstract/Summary:
The accuracy of software quality estimation models is influenced by several factors, two of the most important being the performance of the prediction algorithm and the quality of the data. This dissertation addresses both factors and consists of two components: (1) a proposed genetic algorithm (GA) based optimization of software quality models for accuracy enhancement, and (2) a proposed partitioning- and rule-based filter (PRBF) for noise detection toward improvement of data quality.

We construct a generalized framework for our embedded GA-optimizer and instantiate it for three optimization problems in software quality engineering: parameter optimization for case-based reasoning (CBR) models; module rank optimization for module-order modeling (MOM); and structural optimization for our multi-strategy classification modeling approach, denoted RB2CBL. Empirical case studies using software measurement data from real-world software systems were performed for each optimization problem. The GA-optimization approaches improved software quality prediction accuracy, highlighting the practical benefits of using GAs to solve optimization problems in software engineering.

The proposed noise detection approach, PRBF, was empirically evaluated using data categorized into two classes. Empirical studies on artificially corrupted datasets and datasets with known (natural) noise demonstrated that PRBF can effectively detect both artificial and natural noise. The proposed filter is a stable and robust technique that always provided optimal or near-optimal noise detection results in these studies. In addition, it is applicable to datasets with nominal and numerical attributes, as well as those with missing values. The PRBF technique supports two methods of noise detection: class noise detection and cost-sensitive noise detection. The former is an easy-to-use method that requires no parameter settings, while the latter is suited to applications where each class has a specific misclassification cost. PRBF can also be used iteratively to investigate the two general types of data noise: attribute noise and class noise.
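
To make the idea of GA-based parameter optimization for a CBR model more concrete, the following is a minimal Python sketch. It is not the dissertation's GA-optimizer: the synthetic data, the choice of attribute weights as the tuned parameters, the leave-one-out accuracy fitness, and every function name here (weighted_distance, loo_accuracy, make_toy_cases, ga_optimize) are assumptions made for illustration only. The CBR model is approximated by a weighted nearest-case majority vote.

```python
import random

def weighted_distance(a, b, weights):
    """Weighted squared Euclidean distance between two metric vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def loo_accuracy(weights, cases, k=3):
    """Leave-one-out accuracy of a majority vote over the k nearest cases."""
    correct = 0
    for i, (features, label) in enumerate(cases):
        others = [c for j, c in enumerate(cases) if j != i]
        others.sort(key=lambda c: weighted_distance(features, c[0], weights))
        votes = sum(lbl for _, lbl in others[:k])
        predicted = 1 if 2 * votes > k else 0
        correct += predicted == label
    return correct / len(cases)

def make_toy_cases(rng, n=40):
    """Synthetic modules (assumption): metric 0 decides the label,
    metrics 1 and 2 are irrelevant to it."""
    cases = []
    for _ in range(n):
        metrics = [rng.random(), rng.random(), rng.random()]
        cases.append((metrics, 1 if metrics[0] > 0.5 else 0))
    return cases

def ga_optimize(cases, n_weights, rng, pop_size=20, generations=15):
    """Generational GA with elitism, size-3 tournament selection,
    uniform crossover, and Gaussian mutation clipped to [0, 1]."""
    # Include the unweighted baseline so the result is never worse than it.
    population = [[1.0] * n_weights]
    population += [[rng.random() for _ in range(n_weights)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = [(loo_accuracy(w, cases), w) for w in population]
        elite = max(scored, key=lambda s: s[0])[1]
        next_gen = [elite[:]]  # elitism: carry the best individual forward
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(scored, 3), key=lambda s: s[0])[1]
            p2 = max(rng.sample(scored, 3), key=lambda s: s[0])[1]
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < 0.2 else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    final = [(loo_accuracy(w, cases), w) for w in population]
    best_fit, best = max(final, key=lambda s: s[0])
    return best, best_fit

rng = random.Random(0)  # fixed seed so the run is repeatable
cases = make_toy_cases(rng)
baseline = loo_accuracy([1.0, 1.0, 1.0], cases)
best_weights, best_fit = ga_optimize(cases, 3, rng)
print(f"baseline accuracy: {baseline:.2f}, GA-tuned accuracy: {best_fit:.2f}")
```

Because the unweighted vector is part of the initial population and the best individual is always carried forward, the tuned accuracy in this sketch cannot fall below the baseline on the same data. A realistic study would measure fitness on held-out data rather than on the cases used for tuning.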
Keywords/Search Tags:Software quality, Data, Models, Noise, PRBF