Research On Directed Fuzzing Technology Of JavaScript Engine Based On Natural Language Processing

Posted on: 2023-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Wu
Full Text: PDF
GTID: 2568307025453454
Subject: Computer technology
Abstract/Summary:
JavaScript is widely used in browsers and other applications, and security vulnerabilities in its engines pose serious risks to enterprises and users. Fuzzing is currently the main technique for detecting vulnerabilities in JavaScript engines. However, existing JavaScript engine fuzzers have two weaknesses: the test cases they generate have low syntactic legality or semantic validity, and code coverage is difficult to improve. To address these problems, this thesis uses a natural language processing model to optimize test case generation and improve the syntactic legality and semantic validity of test cases. On this basis, it proposes a directed fuzzing technique based on rich-branch node optimization to address the incomplete and insufficient code coverage of directed fuzzing. Compared with AFLGo, the proposed method improves code coverage significantly and finds 6 previously undisclosed vulnerabilities. The main work and results of this thesis are as follows:

1. To address the low syntactic legality and semantic validity of test cases generated by JavaScript engine fuzzers, a test case generation method based on natural language processing is proposed. The method traverses the abstract syntax tree (AST) and builds the sequence of AST subtrees of height 2, the sequence of subtree parent nodes, the BERT vocabulary, and the BERT input sentences. It pre-trains a BERT language model on the AST subtree sequences of JavaScript test cases, fine-tunes it by combining a residual network with masked language modeling (MLM), and uses the resulting model to generate JavaScript files. Compared with Montage, CodeAlchemist, and Superion, the ability to generate valid test cases improves by 17.14%, 73.22%, and 299%, respectively.

2. To address the incomplete and insufficient code coverage of directed fuzzing, an optimization method for directed fuzzing based on rich-branch nodes is proposed. The concept of a rich-branch node is defined, and an algorithm for extracting rich-branch nodes is given. The method collects coverage information while the target program runs, computes the weights of covered functions and nodes in real time by combining the call graph (CG) and control-flow graph (CFG) of the target program, and generates a list of rich-branch nodes. The seed energy allocation algorithm of AFLGo is then improved according to the weights of the rich-branch nodes. Compared with AFLGo, this method improves the average code coverage per target point by 56.79% while retaining the same ability to reach targets.

3. Combining the NLP-based test case generation method with the rich-branch node optimization method, the directed fuzzing system JRFuzz for JavaScript engines is implemented. Because JavaScript program files are highly structured, a directed fuzzer cannot mutate them randomly in the mutation stage. To address this limitation, a JavaScript seed deduplication method and a JavaScript seed mutation method based on the AST are proposed, and the corresponding algorithms are given. JRFuzz increases code coverage while preserving the ability to reach targets. In a series of experiments, 28 bugs were found, of which 22 match existing CVEs and the remaining 6 are newly discovered vulnerabilities.
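
The AST preprocessing described in item 1 can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that a subtree of height 2 means a node together with its direct children, that the AST is an ESTree-style nested dictionary, and that each subtree is serialized as a string token; the names `children`, `fragments`, `build_vocabulary`, and `EXAMPLE_AST` are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple

Node = Dict[str, Any]

# Hand-written ESTree-style AST for `var x = 1;` (assumed input format).
EXAMPLE_AST: Node = {
    "type": "Program",
    "body": [{
        "type": "VariableDeclaration",
        "kind": "var",
        "declarations": [{
            "type": "VariableDeclarator",
            "id": {"type": "Identifier", "name": "x"},
            "init": {"type": "Literal", "value": 1},
        }],
    }],
}


def children(node: Node) -> List[Node]:
    """Return the direct child AST nodes of `node`, in field order."""
    kids: List[Node] = []
    for key, value in node.items():
        if key == "type":
            continue
        if isinstance(value, dict) and "type" in value:
            kids.append(value)
        elif isinstance(value, list):
            kids.extend(v for v in value if isinstance(v, dict) and "type" in v)
    return kids


def fragments(node: Node, parent: Optional[str] = None) -> Iterator[Tuple[Optional[str], str]]:
    """Pre-order walk yielding (parent type, height-2 subtree token) pairs."""
    kids = children(node)
    # Assumed serialization: the node type followed by its children's types.
    token = "{}({})".format(node["type"], ",".join(k["type"] for k in kids))
    yield parent, token
    for kid in kids:
        yield from fragments(kid, node["type"])


def build_vocabulary(tokens: List[str]) -> Dict[str, int]:
    """Map each distinct subtree token to an integer id (sorted for determinism)."""
    return {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}


if __name__ == "__main__":
    pairs = list(fragments(EXAMPLE_AST))
    vocab = build_vocabulary([tok for _, tok in pairs])
    for parent, tok in pairs:
        print(parent, tok, vocab[tok])
```

In this sketch the token sequence plays the role of a sentence and the parent sequence supplies the context of each subtree. How the thesis actually encodes these sequences for BERT is not stated in the abstract.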
Keywords/Search Tags: fuzzing, natural language processing, JavaScript engine, code coverage rate, rich-branch node