Font Size: a A A

Research On Error-Detection-Oriented Pointer Analysis Techniques

Posted on:2016-03-02Degree:DoctorType:Dissertation
Country:ChinaCandidate:F LiuFull Text:PDF
GTID:1318330482975122Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Pointer analysis is a static program analysis technique, its goal is to determine statically which memory addresses(the locations of variables or functions) one pointer variable may point to, that is, to determine statically all possible runtime values of one pointer variable. The input of pointer analysis algorithm is the source code or some intermediate code representation of the program; and the corresponding output is points-to information of pointer variables. Because the extensive use of pointers(references) in C/C++(Java) programs, many static program analysis techniques have to depend on points-to information of pointer variables to resolve the indirection information in the programs; the results of pointer analysis directly influence the effectiveness of many static program analysis techniques; therefore, pointer analysis has always been an important topic because pointer analysis is a fundamental enabling technology for many static program analysis techniques. For the programs that are less dynamic, have less pointers and are small scale, existing pointer analysis techniques have been mature and could deal with these programs well. However, for the programs that are more dynamic and are Jarge scale, such as, large-scale programs and concurrent programs, when existing pointer analysis techniques deal with these programs, there still exist many problems; and current pointer analysis algorithms need further considerations and improvement:(1)how to improve the efficiency and the scalability of pointer analysis algorithms, without affecting the precision of points-to information; (2)how to improve the analysis precision of pointer analysis algorithms, meanwhile keeping the algorithms efficient and scalable as far as possible; (3)how to devise pointer analysis algorithms to apply to concurrent programs and so on.Error detection is an attractive topic in software engineering research field; for C/C++programs, in order to detect the ubiquitous memory errors, such as, memory leak, multiple release of the same memory space and null pointer dereference, error detection algorithms have to rely on points-to information of pointer variables to resolve the indirection information in the programs; the effectiveness of error detection algorithms directly depends on the precision, efficiency and scalability of underlying pointer analysis algorithms. Similar to sequential programs, for concurrent programs, in order to detect the ubiquitous errors, such as, data race, error detection algorithms also need points-to information of pointer variables to resolve the indirection information in the programs. For error detection algorithms, the ideal goal is to resolve the indirection information in the programs by means of the pointer analysis algorithms with good precision, high efficiency and good scalability.Considering the requirement of error detection algorithms, this thesis conducts deep research on inclusion-based pointer analysis and multithreaded pointer analysis. The research on inclusion-based pointer analysis is focused on how to devise efficient online cycle detection techniques; this is because online cycle detection techniques could significantly improve the efficiency and the scalability of pointer analysis algorithms, without affecting the precision of points-to information. By analyzing the shortcoming of LCD(lazy cycle detection), this thesis proposes three online cycle detection techniques(namely, ELCD, BootCD and ADD) to improve LCD. The research on multithreaded pointer analysis is focused on devising a multithreaded pointer analysis technique based on Petri; the technique is mainly based on the idea of causal dataflow analysis.The major contributions of this thesis are listed below:(1). Considering the shortcoming of LCD, we improve the "cycle effect" of LCD and propose an improved online cycle detection technique called ELCD(extended lazy cycle detection); Compared to LCD, ELCD identifies potential cycles in constraint graph based on more-refined "cycle effect". The "cycle effect" of ELCD is defined and computed based on not only points-to sets of two nodes on one edge but also the points-to set of the successor of destination node of the edge. Based on more-refined "cycle effect", ELCD could significantly reduce the number of cycle detection searches that find no cycles triggered by LCD. Experimental results show that ELCD can improve efficiency of LCD without affecting the precision of points-to information.(2). In order to improve the quality of cycle detection in LCD further, that is, in order to reduce the number of cycle detection searches that find no cycles triggered by LCD, we propose a new online cycle detection technique called BootCD(bootstrapped online cycle detection). BootCD is based on the idea of bootstrapping; BootCD first performs pointer analysis once using an efficient but imprecise pointer analysis algorithm; then combines with the points-to information to help a subsequent pointer analysis algorithm conduct online cycle detection and improves the efficiency of the subsequent pointer analysis algorithm. Specifically, BootCD combines with Steensgaard points-to information to define and construct a new directed graph called bootstrapping constraint graph; bootstrapping constraint graph is a superset of andersen constraint graph. BootCD utilizes the set of edges which belong to the cycles in bootstrapping constraint graph to conduct online cycle detection in andersen constraint graph. Compared with LCD, experimental results show that BootCD significantly improves the quality of cycle detection and helps reduce the overall analysis time; therefore, improves the efficiency of inclusion-based pointer analysis.(3). We propose two new notions, namely, bootstrapping constraint graph and constraint equivalence; and devise the construction of bootstrapping constraint graph and the computation of constraint equivalent variables; bootstrapping constraint graph is the basis for BootCD; and the benefits of constraint equivalence include (a)simplifying the bootstrapping constraint graph, that is, reducing the number of nodes and edges in bootstrapping constraint graph; (b)optimizing the construction of bootstrapping constraint graph; the optimized construction of bootstrapping constraint graph significantly reduces the time overhead and memory requirement of BootCD's offline analysis(namely, the time overhead and memory requirement of constructing the bootstrapping constraint graph and computing the edges in cycles of the bootstrapping constraint graph) and is helpful to improve the effectiveness and efficiency of BootCD.(4). In order to improve the efficiency of online cycle detection in LCD further, we propose a new online cycle detection technique called ADD (ancestor_descendant_dominator). Compared with BootCD, ADD is also based on the idea of bootstrapping. The difference between ADD and BootCD lies in the information that is used to bootstrap online cycle detection. BootCD utilizes the bootstrapping constraint graph and the edges in the cycles of the bootstrapping constraint graph as bootstrapping; however, ADD uses the structure information of the offline constraint graph as bootstrapping. Specifically, ADD first constructs offline constraint graph; then ADD computes and collapses the cycles in the offline constraint graph. Now, the offline constraint graph becomes a directed acyclic graph(DAG); next, in the DAG, for each node, ADD defines and computes Ancestor set and Descendant set; Ancestor/Descendant information is used to help subsequent pointer analysis process conduct online cycle detection; finally, based on the dominator information in offline constraint graph, ADD could identify top-level variables which are pointer equivalent; collapsing these pointer-equivalent top-level variables could reduce the cost of points-to information propagation among nodes during the subsequent analysis and is helpful to improve the efficiency of ADD further. Compared with LCD, primary experimental results show that ADD significantly improves the quality of cycle detection and helps reduce the overall analysis time; therefore, improves the efficiency of inclusion-based pointer analysis.(5). Based on causal dataflow analysis, we devise a multithreaded pointer analysis technique based on Petri net. Specifically, this technique first utilizes 1-safe petri net to represent control flow structure of multithreaded program; then explores partially ordered execution of petri net, and the points-to information of pointer is propagated along causal dependency of events; finally, pointer analysis problem of multithreaded program is converted to the marking coverability problem of petri net. Multithreaded pointer analysis technique based on Petri net provide a possible thought for deep research on multithreaded pointer analysis.
Keywords/Search Tags:Online cycle detection, Inclusion-based pointer analysis, cycle effect, Steensgaard analysis, offline constraint graph, dominator
PDF Full Text Request
Related items