Research On Software Defect Prediction Method Of High Speed Railway

Posted on:2021-02-03

Degree:Doctor

Type:Dissertation

Country:China

Candidate:J H Ren

GTID:1362330614472171

Subject:Software engineering

Abstract/Summary:

Software systems are an important guarantee of the safe and reliable operation of high-speed railways; the construction and operation of high-speed railways are inseparable from the support of computer software systems. Software testing is an effective way to ensure software reliability, and defect prediction for high-speed railway software can help ensure its quality. This dissertation analyzes the relationship between the safe operation of high-speed railways and software systems, studies software defect prediction models in depth, and summarizes the basic theory, the main prediction algorithms, and the commonly used experimental data and evaluation indices for software defect prediction modeling. The supervised and unsupervised software defect prediction models constructed here are compared with industry-standard algorithms to verify the effectiveness of the proposed algorithms. The proposed algorithms are then applied to data collected from high-speed railway software developed by the network management center of Beijing Jiaotong University, and the prediction results are verified with third-party software defect detection tools. The main research work and conclusions are as follows:

(1) A supervised software defect prediction model based on the self-organizing data mining method is proposed for high-speed railway software with defect-labeled data. The method establishes a causal model relating software metrics to software defects. In this model, software metrics are used as independent variables, while software defect factors, including the defect label value, serve as dependent variables. The proposed model can perform both classification prediction and ranking prediction of software defects. Analyses of 14 public defect datasets show the effectiveness of this algorithm compared with six other software defect classification and ranking algorithms. Finally, the algorithm is applied to high-speed railway software developed in the laboratory to predict software defects, and better results are obtained. These results are consistent with those of third-party software defect detection tools.

(2) For high-speed railway software projects with insufficient defect data, traditional cross-project defect prediction algorithms usually select only the source project samples that are similar to the target project data to build the prediction model, which loses useful information. A software defect prediction model based on self-organizing data mining is proposed that takes full account of all instance samples and software metric information from the source project. Before establishing the prediction model, the method first identifies the source project instances that show high correlation coefficients with the target project and increases the weights of these instances. The weights of the remaining instances are left unchanged, and the reweighted instances are combined into a new source project dataset. A model is then built on this dataset to predict the target project defects. Analyses of ten public defect datasets show the effectiveness of this algorithm compared with 11 other cross-project defect classification and ranking prediction algorithms. Finally, the algorithm is applied to high-speed railway software developed in the laboratory to predict software defects. Five defective modules in six projects are all predicted accurately. These results are consistent with those of third-party software defect detection tools.

(3) An unsupervised software defect prediction method based on a power-law function is proposed to determine how to set the threshold for high metric values when no defect label data are available for high-speed railway software. The algorithm targets the 80/20 ("two eight") distribution that characterizes the class imbalance of software defect data, and combines that distribution with the characteristics of a power-law function to predict software defects. First, the power-law function of each metric in the software system is established. Then, the point of maximum curvature of the power-law function is taken as the threshold for deciding whether a metric value is high. Finally, the statistical information about the high metric values is used to classify each instance as defective or defect-free. Analyses of 12 public defect datasets show the effectiveness of the algorithm compared with four other unsupervised software defect prediction algorithms. At the same time, the complexity of the prediction model constructed by the algorithm is lower than that of the other algorithms. Finally, the algorithm is applied to high-speed railway software to complete unsupervised defect prediction on defect data with the given value distribution characteristics. Three defective modules in six projects are predicted accurately.

(4) An unsupervised software defect prediction model based on data grouping by objective clustering is proposed to remove the need, in traditional clustering algorithms, to set the number of clusters manually. For this algorithm, the distance matrix of the instances in the software defect dataset is constructed by calculating the pairwise distances between samples, and all instances are divided into pairs of dipoles based on the principle of minimum distance. The nearest instances in each dipole are then grouped together, again according to the principle of minimum distance. Next, the optimal number of clusters is determined according to an improved consistency criterion, and finally the sample statistics of each cluster determine its class: defective or defect-free. Analyses of 16 public defect datasets are carried out, and the results are compared with those of five other unsupervised clustering algorithms. The results show that the proposed algorithm is effective while maintaining low complexity. Finally, the algorithm is applied to high-speed railway software, completing unsupervised defect prediction with automatic clustering. Three defective modules in six projects are predicted accurately.

In this dissertation, two kinds of methods are studied: supervised and unsupervised. Three algorithms are used to build the defect prediction models: self-organizing data mining, the power-law function, and objective clustering. In conclusion, through research on defect prediction for high-speed railway software, the effectiveness, feasibility and accuracy of the proposed software defect prediction algorithms are verified.
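
The power-law thresholding steps described in item (3) can be illustrated with a minimal sketch. The abstract does not say how the power-law function is fitted or how the "statistical information" is used, so everything specific here is an assumption made for illustration: the power law is fitted to the rank-versus-value curve of each metric by least squares in log-log space, both axes are normalized before the curvature is computed, and an instance is labeled defective when more than half of its metrics are at or above their thresholds. The function names `power_law_threshold` and `predict_defective` are hypothetical and are not taken from the dissertation.

```python
import numpy as np


def power_law_threshold(values):
    """Threshold for one metric: the data value at the rank where a fitted
    power law y = a * x**b has maximum curvature (assumed procedure)."""
    v = np.sort(np.asarray(values, dtype=float))[::-1]  # descending order
    v = v[v > 0]                      # the log-log fit needs positive values
    if v.size < 3:
        return float("inf")           # too little data: flag nothing
    x = np.arange(1, v.size + 1) / v.size   # normalized rank in (0, 1]
    y = v / v[0]                            # normalized value in (0, 1]
    # Least-squares fit of log(y) = log(a) + b * log(x).
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    a = np.exp(log_a)
    # Curvature of y = a * x**b: |y''| / (1 + y'**2)**1.5
    d1 = a * b * x ** (b - 1)
    d2 = a * b * (b - 1) * x ** (b - 2)
    kappa = np.abs(d2) / (1.0 + d1 ** 2) ** 1.5
    k = int(np.argmax(kappa))
    return float(v[k])                # observed value at the knee rank


def predict_defective(X):
    """Label each row (module) defective if most of its metrics are high.
    The majority-vote rule is an assumption, not the source's rule."""
    X = np.asarray(X, dtype=float)
    thresholds = np.array(
        [power_law_threshold(X[:, j]) for j in range(X.shape[1])]
    )
    high_counts = (X >= thresholds).sum(axis=0 + 1)
    return high_counts > X.shape[1] / 2


if __name__ == "__main__":
    # Synthetic data: 16 modules, 3 metrics, each metric decaying as 1/rank.
    X = np.array([[1 / i, 10 / i, 100 / i] for i in range(1, 17)])
    print(predict_defective(X))
```

Because the curvature of a curve depends on the scale of its axes, normalizing ranks and values is one way to make the knee independent of a metric's units; a different normalization would move the threshold.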

Keywords/Search Tags:

High-speed railway, Software defect prediction, Self-organizing data mining, Power-law function, Objective clustering

Related items

1	Research On Data Mining Of Power System Operation Information
2	The Recognition And Prediction Of High-speed Railway Track Slab Based On The Temporal-spatial Data Mining Of Track Geometry
3	Application Of Data Mining Technology To The Fuzzy Control System Of CFBB
4	Research And Application Of Data Mining And Data Processing For Intelligent Power System
5	The Research Of Big Tickets Data Applications About Wuhan-Guangzhou High-speed Railway
6	Power System Equipment Defect Prediction
7	Research On Passenger Flow Analysis Of High Speed Railway Based On Big Data
8	Data Analysis And Application Research On Text Defects Of High Voltage Relay Protection In A Regional Large Power Grid
9	Research On Passenger Flow Analysis Of High Speed Railway Based On Big-data
10	Research On Evaluation Model Of Railway Engineering Geological Conditions Based On Big Data Clustering Mining