Development And Testing Technology For Software Engineering Data Mining

Posted on:2014-11-27

Degree:Doctor

Type:Dissertation

Country:China

Candidate:S Huang

GTID:1108330434973396

Subject:Computer software and theory

Abstract/Summary:

In today's enterprise software development environments, software engineering tools and collaboration environments are widely deployed and used. Over the software life cycle, a large amount of data accumulates, such as static software history repositories, software information bases, and code bases. In the enterprise software process, this data is mainly used for reviewing historical defects, reviewing historical code, and archiving.

Many software engineering problems cannot be addressed well by traditional software engineering models or tools, such as the management and analysis of unstructured requirements documents, team collaboration management, fast software comprehension, and automatic programming. However, they can be addressed by mining the data accumulated during software engineering.

The software life cycle has many stages, of which development and testing are the most important. Making development more efficient through software comprehension and code recommendation, and making testing more efficient through regression test suite reduction with assured coverage, would greatly improve the efficiency of software engineering as a whole. This paper therefore focuses on solving these issues in real projects by mining software engineering data, and addresses three problems: software comprehension, software development, and regression testing. Specifically, we investigate the following problems and make the following contributions.

1. This paper presents a new visualization method, based on two-phase hierarchical clustering, to support code comprehension. The first phase clusters by call entrance, and the second phase clusters using an extension of PageRank. The clustering results can be visualized as dependencies among software modules at multiple granularities. A trial with professional programmers verified that this method improves the efficiency of software comprehension.
2. This paper is the first to apply frequent subtree mining to the XML configuration files of J2EE applications in order to automatically recommend XML application configuration files. The mining method incorporates characteristics of the XML trees in the configuration files, which greatly improves the efficiency of frequent subtree mining. The experimental results show that, by automatically generating XML substructures and displaying sample code, the method helps programmers write XML configuration files more efficiently.
3. This paper is the first to mine the correlation between XML configuration files and their context code in order to automatically recommend element values and attribute values in the XML configuration files of J2EE applications. The experimental results show that this method can automatically generate reusable XML element values and attribute values and can also detect violations in element values and attribute values at compile time, thus greatly improving the coding efficiency of XML configuration files.
4. This paper is the first to propose a safe regression test selection method for J2EE applications based on XML configuration frameworks. It proposes an end-to-end safe regression testing solution for J2EE applications with three features that existing approaches do not address: hybrid test-case tracing, unified change identification, and the corresponding safe regression test selection. The evaluation on real projects shows that this method selects all regression test cases that have the potential to detect regression defects in such applications.
5. This paper further presents a method of optimized regression test selection. Based on heuristics and dynamic real-time test feedback, this method further classifies and prioritizes the regression test suite. The evaluation on one real project shows that this approach ensures change coverage and reduces regression testing cost under time and resource constraints.
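
The selection and prioritization ideas in contributions 4 and 5 can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes that test-case tracing has already produced a map from each test to the files it touches (code and XML configuration alike) and that change identification has produced a set of changed files. The names `select_tests`, `prioritize`, `coverage`, and `changed` are hypothetical, and the greedy "additional coverage" ordering is a standard heuristic swapped in for the heuristics and real-time feedback that the abstract mentions but does not detail.

```python
from typing import Dict, List, Set


def select_tests(coverage: Dict[str, Set[str]], changed: Set[str]) -> List[str]:
    """Keep every test that touches at least one changed entity.

    Dropping only tests that touch nothing changed is what makes the
    selection 'safe' with respect to the given coverage map.
    """
    return sorted(t for t, entities in coverage.items() if entities & changed)


def prioritize(coverage: Dict[str, Set[str]], changed: Set[str]) -> List[str]:
    """Order the selected tests with a greedy 'additional coverage' heuristic."""
    remaining = {t: coverage[t] & changed for t in select_tests(coverage, changed)}
    uncovered = set(changed)
    order: List[str] = []
    while remaining:
        # Pick the test covering the most still-uncovered changes;
        # ties are broken by test name so the result is deterministic.
        best = min(remaining, key=lambda t: (-len(remaining[t] & uncovered), t))
        order.append(best)
        uncovered -= remaining.pop(best)
    return order


if __name__ == "__main__":
    # Hypothetical coverage map mixing Java sources and XML configuration files.
    coverage = {
        "test_login": {"LoginAction.java", "struts-config.xml"},
        "test_order": {"OrderService.java", "spring-beans.xml"},
        "test_report": {"ReportService.java"},
        "test_zcheckout": {"OrderService.java", "struts-config.xml"},
    }
    changed = {"struts-config.xml", "OrderService.java"}
    print(select_tests(coverage, changed))
    print(prioritize(coverage, changed))
```

Under a time or resource limit, a prefix of the prioritized list could be run first; in this sketch, the tests that cover the most outstanding changes come earliest, so change coverage is reached as soon as possible.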

Keywords/Search Tags:

Program Comprehension, Code Recommendation, Regression Test Selection, Clustering, Frequent Subtree Pattern, Association Rules

Related items

1	Study On Frequent Subtree Mining And Its Application In XML Mining
2	Algorithm For Mining Association Rules Based On Clustering
3	An Algorithm And Context Analysis Of Mining Frequent Closet Itemsets
4	Study And Design On The Algorithms Of Mining Association Rules
5	Research On Frequent Subtree Mining And Pruning Strategies
6	The Research On The Related Problems Of Association Rule Mining
7	The Research Of Association Rules Algorithm Based On Frequent Pattern Tree
8	Research And Application Of An Improved Algorithm For Association Rules
9	The Research On Frequent Subtrees Mining And Corresponding Techniques
10	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System