Font Size: a A A

Network traffic analysis using stochastic grammars

Posted on:2013-01-28Degree:Ph.DType:Dissertation
University:Clemson UniversityCandidate:Lu, ChenFull Text:PDF
GTID:1458390008981600Subject:Engineering
Abstract/Summary:
Network traffic analysis is widely used to infer information from Internet traffic. This is possible even if the traffic is encrypted. Previous work uses traffic characteristics, such as port numbers, packet sizes, and frequency, without looking for more subtle patterns in the network traffic. In this work, we use stochastic grammars, hidden Markov models (HMMs) and probabilistic context-free grammars (PCFGs), as pattern recognition tools for traffic analysis.;HMMs are widely used for pattern recognition and detection. We use a HMM inference approach. With inferred HMMs, we use confidence intervals (CI) to detect if a data sequence matches the HMM. To compare HMMs, we define a normalized Markov metric. A statistical test is used to determine model equivalence. Our metric systematically removes the least likely events from both HMMs until the remaining models are statistically equivalent. This defines the distance between models. We extend the use of HMMs to PCFGs, which have more expressive power. We estimate PCFG production probabilities from data. A statistical test is used for detection.;We present three applications of HMM and PCFG detection to network traffic analysis. First, we infer the presence of protocol tunneling through Tor (the onion router) anonymization network. The Markov metric quantifies the similarity of network traffic HMMs in Tor to identify the protocol. It also measures communication noise in Tor network.;We use HMMs to detect centralized botnet traffic. We infer HMMs from botnet traffic data and detect botnet infections. Experimental results show that HMMs can accurately detect Zeus botnet traffic.;To hide their locations better, newer botnets have P2P control structures. Hierarchical P2P botnets contain recursive and hierarchical patterns. We use PCFGs to detect P2P botnet traffic. Experimentation on real-world traffic data shows that PCFGs can accurately differentiate between P2P botnet traffic and normal Internet traffic.
Keywords/Search Tags:Traffic, P2P, Hmms, Data, Pcfgs, Used
Related items