Font Size: a A A

Dynamic Testcase Generation Based Automatic Defect Mining For Binaries

Posted on:2011-04-02Degree:DoctorType:Dissertation
Country:ChinaCandidate:G LiFull Text:PDF
GTID:1118360308985584Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Defect mining is important for software used efficiently and safely. Pre-release testing, the last phase of software testing, provides the best opportunity for defect detection before prelease. Testing pre-release binaries, integrated with third-party component, can detect defects, introduced by compiler optimization and code shuffler, which cannot be detected by other testing phases. However, there is still a lack of efficient, complete and easing bug fixes testing techniques or systems for pre-release testing. It becomes a research trend of applying the techniques for unit or component testing to systemic testing. Among them, dynamic testcase generation is a key choice because it is automatic, false-alarm free and easing bug fixes.This thesis studies how to adapt dynamic testcase generation to large-scale pre-release binaries. The architecture- and OS-independence, complete path constraint collection and performance optimization are researches, with the objective of building an efficient, accurate and retargetable dynamic testcase generation for pre-release binaries. The main contributions of this thesis are as follows.1. The current dynamic testcase generation system completely depends on the details of target ISA. This thesis designs a high performance meta ISA, which can both accurately describe the semantic of binaries, and simplify and accelerate dynamic testcase generation system built with it. With the meta ISA, our testing system, with a total of 209,000 lines of code, can be retargeted to a new ISA with only 1% code overhead.2. Existing dynamic testcase generation techniques usually generate a lot of testcases diverging from the target path method. This thesis presents the testcase generation based Dynamic Binary Accurate Path Covering (DBAPC) approach, the first one generating testcase accurately directing to the target path. First, the theory of complete feasible branch constraints is presented. Based on the theory, the Path Dynamic Symbolic Execution (PDSE) is defined and built for the complete branch constraint collection. Then, a satisfiablity modulo theory solver HSolver is supplied to generate testcase by solving the collected branch constraints. At last, the theory and collection of complete branch constraints for taint input is presented in order to accelerate large-scale pre-release binaries. It can be proved that DBAPC can test all taint-bound feasible branches with the least generated testcase.3. The DBAPC approach gets large and complex feasible branch constraints when dealing with the deep path of large-scale pre-release binaries. The time/space complexity of existing dynamic testcase generate system may increase exponentially, which fails the symbolic analysis. The thesis presents the DBAPC optimizations with linear time/space complexity. First, the presented single assignment taint transformation bounds the constraint presentation to the linear space complexity. Then, the taint single assignment DAG (TSADAG) that descripts the definition-use relation among taints is defined and built, based on the transformation. At last, the branch constraint is partitioned and simplified based on the coherence of target branch condition and the built TSADAG. Thus, the DBAPC optimization simplifies the constraints for feasible branch and decreases the overheads for HSovler.4. Tainted pointers make the dependence among variables complex and uncertain, symbolic taint propagation hard, and increased the difficulty of constraint analysis and collection. Existing dynamic testcase generation systems either ignore pointers or partially address the pointer with the help of imprecise point-to analysis, and thus generating a lot of diverged testcases. This thesis presents the Extended DBAPC (E-DBAPC) approach, the first one accurately dealing with tainted pointers, including precise point-to analysis, extending the TSADAG (E-TSADAG) for tainted pointers, and a complete branch constraint collection algorithm with tainted pointers, based on the TSADAG.5. Based on our DBAPC/E-DBAPC, we design and implement the Hunter system, the first accurate dynamic testcase generation system for the defect mining of large scale pre-release binaries. The Hunter system can efficiently deal with binaries of any ISA over any OSes. Hunter system adapts many techniques for high performance and good retargetablity, including mixing concrete and symbolic execution, and virtual machine event-based process/thread identification. Tests showed that, our Hunter system can accurately and efficiently detect all defects in Benchmark with known defects. At the same time, Hunter system detects 27 fatal defects from large-scale business software, such Word 2007. Three of them that can result in remote arbitrary code execution, received confirmation and great attention from the Microsoft and Baidu companies. Microsoft found the reason for the defects easily, with the help of the generated testcase, and patched quickly.
Keywords/Search Tags:Defect Detected, Software Testing, Dynamic Test Generation, Feasible Path, Point-to Analysis, Path Constraints
PDF Full Text Request
Related items