The Research On Approaches For Botnet Detection

Posted on: 2011-10-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B B Wang
Full Text: PDF
GTID: 1118360305992009
Subject: Information security
Abstract/Summary:
Botnets are sophisticated platforms for large-scale attacks, composed of many compromised computers that attackers control remotely. They are becoming one of the most serious threats to cyber-security. The fundamental difference between botnets and traditional malware (such as Trojans and worms) is that the attacker directs the zombies to launch malicious activities such as DDoS attacks, phishing, and spamming through one-to-many C&C (Command and Control) channels. Accurate identification of zombie hosts is the first step in defending against malicious behavior launched by botnets, and it is vital to network management and to preventing cyber crime.

Initially, botnets implemented centralized C&C based on the IRC (Internet Relay Chat) and HTTP (Hypertext Transfer Protocol) protocols. They later adopted distributed C&C over P2P (Peer-to-Peer) protocols to remove the single point of failure and to increase robustness and concealment. Because the botmaster operates in a highly controllable environment, the signatures of bot programs can be updated frequently, which degrades the effectiveness of signature-based detection. At present, there are two kinds of methods for identifying zombie hosts. One mines abnormal communication behavior by parsing the relevant communication protocols (applied to IRC and HTTP botnets); the other analyzes the similarity of C&C communication behavior and attack behavior among zombie hosts of the same kind (applied to IRC, HTTP, and P2P botnets). However, the former cannot deal with encrypted communications, and the latter has difficulty identifying a single zombie host because it presupposes that many zombie hosts of the same kind exist in the monitored network. Active measurement is effective for identifying P2P zombie hosts, but it introduces much unnecessary traffic into the network and significantly affects the communication of normal peers.

In this work, an active measurement method named AASD is proposed to identify Storm bots based on the anomalous relationship between logical address and communication address. AASD can identify the Storm botnet, which parasitizes the current Overnet peer-to-peer network and causes enormous harm. Overnet is a kind of DHT network in which, theoretically, there is a one-to-one relationship between a node identifier and its communication address (IP address and port). In practice, however, one-to-many and many-to-one relationships exist between them. The former is called identifier aliasing and the latter communication address aliasing. Based on an analysis of the two phenomena, we discover two characteristics of Storm zombie entries: (i) the identifier and the communication address in the same entry both exhibit aliasing; (ii) the IP addresses corresponding to each aliased identifier are not concentrated in a specific subnet. After deploying a high-speed Overnet crawler on the PlanetLab testbed, we collect a large number of Index-Address entries for experiments. Using set theory, we then identify the Index-Address nodes that exhibit both identifier aliasing and communication address aliasing. We quantify the divergence of IP addresses using maximum entropy theory and take the divergence as the main basis for identifying zombies: if the divergence exceeds a predefined threshold, the aliased identifier is one used by Storm bots. Compared with existing active detection methods, AASD not only identifies active Storm bots with a 95% detection ratio but also identifies non-active Storm bots. Moreover, AASD consumes 60% less bandwidth and effectively reduces interference with normal Overnet peers.

A method named SIDPI is also proposed to identify P2P zombie hosts based on the similarity of the distributions of interactive flow-aggregations; it can identify encrypted Storm zombie hosts. A flow-aggregation is the set of all flows of an (IP, Port) pair within a certain period of time. For non-zombie applications, the average lengths of all flows through the monitored port vary widely across time windows, whereas the distributions for zombie hosts appear similar. We calculate the average packet length of the flow-aggregation over multiple consecutive time windows and quantify the distance between two adjacent time windows using relative entropy. As the identification metric, if the distance among distributions exceeds a predefined threshold, the host that deployed this (IP, Port) pair is identified as a zombie. To reduce network traffic and improve the efficiency of measuring the similarity of flow-aggregation distributions, we propose a small flow-aggregation (SFAFA) algorithm to extract suspicious (IP, Port) pairs. The advantages of SFAFA are as follows: (1) it can identify zombies that use encrypted communication; (2) it can identify a single zombie host in the supervised network, especially in the early stage of dissemination, since it does not depend on the similarity of communication and attack behavior among multiple zombies. The results show that the SFAFA algorithm can filter out more than 98% of the (IP, Port) pairs in the network and improve the efficiency of the subsequent procedure. The accuracy of SIDPI reaches 94.45% on average with both encrypted and unencrypted samples.

We also propose the BMBD algorithm to identify IRC and HTTP zombie hosts in the supervised network by matching the network connection behavior models of zombies. Analysis shows that different connections to zombie nodes are similar in that a periodic interval separates them. A BCM (Bot Connection-Behavior Model) is therefore built by aggregating similar connections with an unsupervised clustering method and mining potential periods with a cyclic correlation function. Then, after capturing traffic at the network border, hosts are identified as zombie hosts by BCM pattern matching. The results show that BMBD depends neither on the content of communications among zombie hosts nor on the group behavior of zombie hosts. The detection accuracy for a single zombie node in the monitored network is over 95%. In addition, BMBD can detect variants of zombies whose BCM is already known, with a detection accuracy of about 86.67%.
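The AASD idea described in the abstract (find entries where both the identifier and the communication address are aliased, then measure how widely the aliased identifier's IP addresses are spread) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. The function names, the use of IPv4 /24 prefixes as "subnets", the normalization by the maximum possible entropy, and the threshold value 0.8 are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def subnet_entropy(ips, prefix_octets=3):
    """Normalized Shannon entropy (0..1) of the subnets that the IPs fall into.

    Assumption: IPv4 dotted-quad strings, with a subnet taken to be the
    first `prefix_octets` octets (a /24 by default).
    """
    subnets = Counter(".".join(ip.split(".")[:prefix_octets]) for ip in ips)
    total = sum(subnets.values())
    if total <= 1:
        return 0.0
    h = -sum((c / total) * math.log2(c / total) for c in subnets.values())
    # The maximum entropy, log2(total), is reached when every IP is in its own subnet.
    return max(0.0, h / math.log2(total))


def suspicious_identifiers(entries, threshold=0.8):
    """Return {identifier: divergence} for identifiers that look Storm-like.

    entries: iterable of (identifier, ip, port) tuples from a crawl.
    threshold: hypothetical cut-off; the abstract does not give a value.
    """
    addrs_by_id = defaultdict(set)
    ids_by_addr = defaultdict(set)
    for ident, ip, port in entries:
        addrs_by_id[ident].add((ip, port))
        ids_by_addr[(ip, port)].add(ident)

    flagged = {}
    for ident, addrs in addrs_by_id.items():
        if len(addrs) < 2:
            continue  # no identifier aliasing (one identifier, one address)
        if not any(len(ids_by_addr[a]) > 1 for a in addrs):
            continue  # none of its addresses shows communication address aliasing
        distinct_ips = {ip for ip, _port in addrs}
        divergence = subnet_entropy(distinct_ips)
        if divergence > threshold:
            flagged[ident] = divergence
    return flagged
```

Under these assumptions, an identifier seen at three addresses in three different /24 subnets, one of which is shared with another identifier, is flagged with a divergence close to 1, while an identifier whose aliased addresses all sit in one /24 is not.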
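The relative-entropy comparison that SIDPI applies to adjacent time windows can be sketched as follows. The abstract does not specify how the distributions are built, so this sketch assumes a fixed-width histogram of packet lengths per window for one (IP, Port) pair, with a small smoothing constant to avoid division by zero. All names and parameter values are hypothetical. The sketch only computes the mean distance; comparing it with a threshold is left to the caller.

```python
import math


def length_histogram(lengths, bin_width=100, max_len=1500, eps=1e-6):
    """Smoothed probability histogram of packet lengths for one time window."""
    n_bins = max_len // bin_width + 1
    counts = [0] * n_bins
    for length in lengths:
        # Lengths above max_len are clamped into the last bin.
        counts[min(length // bin_width, n_bins - 1)] += 1
    total = sum(counts) + eps * n_bins
    return [(c + eps) / total for c in counts]


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


def mean_adjacent_distance(windows, **hist_args):
    """Average relative entropy between each pair of adjacent time windows.

    windows: list of lists of packet lengths, one inner list per time window.
    """
    hists = [length_histogram(w, **hist_args) for w in windows]
    dists = [relative_entropy(a, b) for a, b in zip(hists, hists[1:])]
    return sum(dists) / len(dists) if dists else 0.0
```

With this construction, windows whose packet lengths fall into the same bins in the same proportions give a distance of zero, and windows dominated by different length ranges give a large distance.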
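The period-mining step of the BCM can be illustrated with a circular autocorrelation over binned connection times. This sketch covers only that step, not the clustering of similar connections or the final pattern matching. The bin size, the score cut-off of 0.5, and the function names are assumptions rather than values taken from the dissertation.

```python
def circular_autocorrelation(series):
    """Normalized circular autocorrelation of a numeric series for every lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    var = sum(c * c for c in centered)
    if var == 0:
        return [0.0] * n  # a constant series carries no periodicity information
    return [
        sum(centered[i] * centered[(i + lag) % n] for i in range(n)) / var
        for lag in range(n)
    ]


def dominant_period(timestamps, bin_seconds=1.0, min_score=0.5):
    """Return the most likely connection period in seconds, or None.

    timestamps: connection start times (seconds) for one host/destination pair.
    min_score: hypothetical cut-off on the autocorrelation peak.
    """
    if len(timestamps) < 3:
        return None
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) / bin_seconds) + 1
    series = [0] * n_bins
    for t in timestamps:
        series[int((t - start) / bin_seconds)] += 1

    acf = circular_autocorrelation(series)
    # Only lags up to half the series length are searched; max() keeps the
    # smallest lag among ties, which favors the fundamental period.
    best_lag = max(range(1, n_bins // 2 + 1), key=lambda lag: acf[lag], default=None)
    if best_lag is None or acf[best_lag] < min_score:
        return None
    return best_lag * bin_seconds
```

For example, connections made every 30 seconds produce a strong autocorrelation peak at a lag of 30 bins, whereas irregularly spaced connections produce no peak above the cut-off and the function returns None.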
Keywords/Search Tags: Botnets, Abnormal Address Correspondence, Similarity of Interactive Flow-aggregation Distributions, Connection Behavior Model, Internet Relay Chat Protocol, Peer-to-Peer Network, Detection Method