Research on Vulnerability Discovery, Identification, and Diagnosis

Posted on: 2020-04-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D L Mu
Full Text: PDF
GTID: 1368330605450423
Subject: Computer Science and Technology
Abstract/Summary:
Software systems are expanding into every aspect of the real world, so security vulnerabilities in them pose a serious threat to users, organizations, and even nations. As functionality grows more complex, software systems inevitably contain defects despite developers' best efforts. Software vendors set up in-house testing teams and employ various methods, including fuzz testing, to find vulnerabilities. However, as software grows more complex and release cycles become shorter, it is no longer feasible for in-house teams to discover all possible vulnerabilities before a release. When these defects are triggered, a program typically crashes and terminates abnormally, which severely affects the robustness of the software and the user experience.

To identify and remove these vulnerabilities, software vendors pursue two further strategies besides fuzz testing. On one hand, an increasing number of vendors encourage anyone on the Internet (e.g., white hat hackers, security analysts, and even regular software users) to report vulnerabilities via vulnerability reporting websites or "bug bounty" programs. Software developers then try to reproduce the reported vulnerabilities and confirm their existence. By reproducing a vulnerability repeatedly, developers can debug it and pinpoint its root cause. On the other hand, software vendors automatically collect software crash reports in order to identify and fix the underlying vulnerabilities. Developers analyze the information in the crash reports to pinpoint the vulnerabilities that led to the failure and their root causes. Once vendors have the vulnerability details, they can develop the corresponding patches and fix the defects.

For a given security vulnerability, there are therefore three stages at which we can help remove it from a software system: software vulnerability discovery, software vulnerability reproduction, and software failure analysis. Improving performance and effectiveness plays an important role in removing these vulnerabilities. In the three stages, we employ several new approaches, for example hardware assistance and deep learning, to improve the ability to identify and remove vulnerabilities in software systems.

Existing coverage-based binary-only fuzzing relies on heavy dynamic instrumentation, which makes it much slower than source-available fuzzing. For the vulnerability discovery stage, we design PTrix, an efficient hardware-assisted fuzzing tool that uses Intel PT in place of heavy dynamic instrumentation to collect code coverage, thereby improving the performance of binary-only fuzzing and exploring new code space.

For the vulnerability reproduction stage, we perform the first empirical analysis of a wide range of real-world security vulnerabilities with the goal of quantifying their reproducibility. Our experiments show that vulnerability reports commonly omit various kinds of information and that real-world security vulnerabilities have low reproducibility. By crowdsourcing the information gathering widely, security analysts can increase the reproduction success rate, but they still face key challenges in troubleshooting the non-reproducible cases. To explore solutions further, we surveyed hackers, researchers, and engineers with extensive domain expertise in software security. Going beyond Internet-scale crowdsourcing, we find that security professionals rely heavily on manual debugging and speculative guessing to infer the missing information.

Finally, for the software failure diagnosis stage, POMP leverages Intel PT to record the control flow during program execution and then recovers the data flow and program state through reverse execution, thus locating the root cause of a software crash. POMP uses hypothesis testing to identify memory alias relationships. However, the computational complexity of hypothesis testing is exponential, which severely limits the effectiveness and efficiency of software failure analysis. We explore two approaches to improve memory alias analysis. First, we design POMP++, which introduces a reverse execution mechanism to construct the data flow that a program followed prior to its crash. POMP++ also uses Value-set Analysis, which helps verify memory alias relations, to improve data flow recovery. Second, we develop RENN, which employs a recurrent neural network (RNN) to learn the binary code patterns pertaining to memory accesses and then infers the memory region accessed by each memory reference. Since memory references to different regions naturally indicate a non-alias relationship, our neural architecture can greatly reduce the burden of hypothesis testing when tracking down non-alias relations in binary code.

We evaluate our work against state-of-the-art research. Given the same amount of time, PTrix achieves a significantly higher fuzzing speed and reaches code regions missed by the other fuzzers. In addition, PTrix identifies 35 new vulnerabilities in a set of previously well-fuzzed binaries, showing its ability to complement existing fuzzers. POMP++ can pinpoint the root cause of real-world vulnerabilities in most cases, and it accurately and efficiently pinpoints the program statements that truly contribute to the crashes, making failure diagnosis significantly more convenient. RENN significantly improves the efficiency of locating the root cause of crashes. Compared to a state-of-the-art technique, RENN has 36.25% faster execution time on average, detects an average of 21.35% more non-alias pairs, and successfully identifies the root cause in more cases.
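The region-based pruning idea described for RENN can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name, the reference labels (m1 to m4), and the region names are hypothetical, and the region predictions are supplied by hand here rather than produced by a neural network. The sketch only shows how references predicted to lie in different regions can be ruled out as aliases, leaving fewer pairs for hypothesis testing.

```python
from itertools import combinations


def pairs_needing_hypothesis_test(region_of):
    """Return the pairs of memory references that may still alias.

    region_of maps a (hypothetical) memory-reference label to its predicted
    region, e.g. "stack", "heap", or "global". None means the region is unknown.
    """
    pending = []
    for a, b in combinations(sorted(region_of), 2):
        region_a, region_b = region_of[a], region_of[b]
        # Two references with known, different regions cannot alias,
        # so the pair is skipped and needs no hypothesis test.
        if region_a is not None and region_b is not None and region_a != region_b:
            continue
        pending.append((a, b))
    return pending


# Hand-written region predictions, for illustration only.
refs = {"m1": "stack", "m2": "heap", "m3": "stack", "m4": None}
remaining = pairs_needing_hypothesis_test(refs)
total = len(list(combinations(refs, 2)))
print(f"{len(remaining)} of {total} pairs still need testing: {remaining}")
```

In this toy input, the stack/heap pairs are discarded, while pairs in the same region or with an unknown region remain candidates.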
Keywords/Search Tags:Vulnerability Discovery, Fuzz Testing, Software Vulnerability Reproduction, Software Failure Analysis, Value-set Analysis, Deep Learning