| However,poorly-designed regexes can yield exponentially many matching steps,and lead to regex Denial-of-Service(ReDoS)attacks under well-conceived string in-puts.As a kind of algorithm complexity attacks,ReDoS has received increasingly attention from researchers.Regular expression(regex)with modern extensions is one of the most popular string processing tools.Featured with various grammatical extensions,regexes are widely used in crawlers,text editors,web applications,search engines,command-line utilities,databases,to name but a few.However,poorly-designed regexes can yield ex-ponentially many matching steps,and lead to regex Denial-of-Service(ReDoS)attacks under well-conceived string inputs.As a kind of algorithm complexity attacks,ReDoS has received increasingly attention from researchers.Existing ReDoS detection tools(static automata analysis,or dynamic black-box fuzzers)fall short when applied to extended regexes.If the ReDoS vulnerability re-quires a certain prefix to trigger,existing approach either misses it or requires unaf-fordable time to find it.We propose a three-phase gray-box analytical approach,RESCUE,based on the e-NFA model we defined for extended regexes,to address this problem.RESCUE sys-tematically seeds(by a genetic search),incubates(by another genetic search),and fi-nally pumps(by a regex-dedicated algorithm)for efficiently generating strings with maximized search time.We evaluated RESCUE against regexes from existing work,and applied RESCUE to open-source projects.The evaluation results show that RESCUE found 49%more ReDoS-vulnerable regexes compared with the best existing technique,and each regex can be checked in approximately one second on average.RESCUE also found ten previ-ously unknown ReDoS vulnerabilities,and some of these vulnerabilities are confirmed and fixed by the projects' developers. |