Font Size: a A A

Programming Languages Techniques for Stream Data Processing Applied in Security

Posted on:2016-02-25Degree:Ph.DType:Thesis
University:The University of Wisconsin - MadisonCandidate:Luchaup, DanielFull Text:PDF
GTID:2478390017984519Subject:Computer Science
Abstract/Summary:
Signature based Network Intrusion Detection and Prevention Systems (NIDPS) and format aware encryption use at their core finite-automata structures or algorithms to match against regular expressions and respectively to rank elements in a language.They also need to process data streams at increasingly high line speeds, which is a major challenge considering some limitations imposed by classical language and automata theory. For generic programs, these limitations are merely a matter of performance, but for security critical applications, these limitations affect their functionality and finding domain specific optimizations become a necessity.;In this thesis, I rely on domain specific insight to devise alternative methods that speculate about system state in order to go beyond the limits imposed by classic automata and language theory.;To address the problem of regular expression matching performed by NIDPS, I exploit the state locality of finite automata and low frequency of matches in two ways. First, I use speculation about the unknown future states to efficiently parallelize pattern matching. Second, I use speculation about how often the automata states will be traversed, to approximate and compress automata and to build new matching structures with better trade-off in terms of memory usage and matching speed.;To accelerate string ranking used by formatted encryption, I exploit the fact that in practice, even if a language representation is potentially ambiguous, for instance allowing strings with multiple accepting NFA paths or multiple context free grammar derivations, the degree of ambiguity is limited. When efficient ranking algorithms are not known or possible for certain language representations, I turn to ambiguous representations and create robust iterative algorithms that speculate that a non-ambiguous string is quickly encountered. This extends the application domain of formatted encryption from regular languages to context free and beyond.;I obtain new algorithms, data structures and even propose hardware to accelerate the scanning or encryption of data streams. My experiments and simulations show that the new algorithms have very good performance in the common case, and have bounded and well understood worst case performance. Therefore, they limit the potential power of algorithmic attacks.
Keywords/Search Tags:Language, Data, Automata, Encryption
Related items