
Research On Source Code Alert Classification Based On Cost Sensitive Manner

Posted on: 2017-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Pan
Full Text: PDF
GTID: 2348330509954401
Subject: Software engineering
Abstract/Summary:
As an auxiliary approach to software testing, static analysis tools help developers locate potential code errors early in development, without compiling or running the software. However, studies have shown that such tools report large numbers of source code alerts, most of which are meaningless false positives, and checking each false positive wastes considerable resources. To reduce this burden and make static analysis tools more usable, researchers have used statistical and machine learning techniques to classify each alert as actionable or unactionable. These classification techniques, however, do not consider the class imbalance caused by false positives or the unequal costs of different misclassifications. For this reason, our work classifies source code alerts with cost-sensitive neural network techniques, which can handle the class imbalance problem effectively. The main work is as follows:

(1) We survey existing alert classification and actionable alert identification methods, and improve and implement an automatic alert identification method that follows the evolution of the software and labels alerts as actionable or unactionable based on a series of sequential bug-fix releases.

(2) Using the Jira bug tracking system to collect the number of defects in each release, we analyze the correlation coefficients between alerts and defects and between actionable alerts and defects, to verify the validity of the automatic alert identification method.

(3) From the perspective of alert category, type, and priority, we study the distributions of source code alerts and actionable alerts to find which categories and types of alert are more likely to be actionable, so that developers can prioritize repairs sensibly.

(4) Finally, on the data set collected by the automatic alert identification method, and after extracting alert characteristics, we apply back-propagation (BP) neural networks and cost-sensitive neural networks based on over-sampling, under-sampling, and threshold moving to classify alerts.

In three open source experimental projects, the average Spearman coefficient between actionable alerts and defects is 0.732, which supports the feasibility of the automatic alert identification method. The alert analysis suggests that developers should pay more attention to alerts in categories such as Correctness and Experimental, and to alerts of types such as GC_UNRELATED_TYPES and IL_INFINITE_LOOP, which are more likely to be actionable. Lastly, the classification results show that, compared with BP neural networks, the cost-sensitive neural network techniques increase the recall of actionable alerts by 44.07% on average. Moreover, when the cost of misclassifying an actionable alert is above a certain value, the cost-sensitive techniques achieve a lower total classification cost.
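To make the threshold moving idea concrete, the following is a minimal sketch and not the thesis implementation. It assumes a classifier that already outputs a probability that an alert is actionable, and it assumes two hypothetical misclassification costs, `cost_fn` (an actionable alert labeled unactionable) and `cost_fp` (an unactionable alert labeled actionable). The abstract does not state the cost values, the network architecture, or the features, so the function names and all numbers below are illustrative assumptions.

```python
from typing import List, Sequence


def cost_threshold(cost_fn: float, cost_fp: float) -> float:
    """Decision threshold that minimizes expected cost for a binary decision.

    Labeling an alert actionable is cheaper in expectation when
    p * cost_fn >= (1 - p) * cost_fp, i.e. p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def classify(probs: Sequence[float], threshold: float) -> List[int]:
    """Label an alert actionable (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly actionable alerts that were labeled actionable."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               cost_fn: float, cost_fp: float) -> float:
    """Sum of misclassification costs over all alerts."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn * cost_fn + fp * cost_fp


if __name__ == "__main__":
    # Hypothetical, hand-made data: 1 = actionable, 0 = unactionable.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    probs = [0.9, 0.4, 0.3, 0.2, 0.1, 0.05, 0.6, 0.15]
    cost_fn, cost_fp = 5.0, 1.0  # assumed costs, not taken from the thesis

    baseline = classify(probs, 0.5)
    moved = classify(probs, cost_threshold(cost_fn, cost_fp))

    print("baseline recall:", recall(y_true, baseline),
          "cost:", total_cost(y_true, baseline, cost_fn, cost_fp))
    print("moved recall:   ", recall(y_true, moved),
          "cost:", total_cost(y_true, moved, cost_fn, cost_fp))
```

With equal costs the threshold stays at 0.5 and nothing changes. As `cost_fn` grows relative to `cost_fp`, the threshold drops, more alerts are labeled actionable, and recall rises at the price of extra false positives. On this toy data the moved threshold also lowers total cost, which mirrors the abstract's observation only qualitatively.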
Keywords/Search Tags:Cost Sensitive, Neural Networks, Static Analysis, Actionable Alert, Unactionable Alert