
An Architecture For An Integrated Innate And Adaptive Artificial Immune System Applied To Network Intrusion Detection

Posted on: 2014-02-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Richard Maina Rimiru
GTID: 1228330434952094
Subject: Computer application technology
Abstract/Summary:
The biological Human Immune System (HIS) has become of great interest to computer scientists, as it provides a unique and fascinating computational paradigm for solving complex problems. The application of theoretical immunology and of observed immune functions, principles, and models to solving complex problems has gradually developed into a new research field called Artificial Immune Systems (AIS). AIS-based algorithms have since found numerous application areas, including computer security, clustering and classification, optimization, and robotics. While there have been many successful applications of AIS, few exemplars really stand out as instances of successfully applying an AIS algorithm to hard, real-world problems, or of AIS being used in earnest in industry. This has triggered research on the need to examine the HIS metaphor in detail, in an effort to understand and capitalize on those features of the metaphor that would help distinguish it from other existing methodologies, increasing its significance as a problem-solving mechanism and ultimately leading to its increased adoption.

The HIS is known to keep the body healthy by protecting it from disease-causing agents, commonly referred to as pathogens. The architecture of the HIS is multi-layered, with defences at several levels.
Although the skin and some physiological conditions, such as chemical and biological factors like fatty acids in sweat and lysozyme, act as barriers to pathogens and are considered part of the multi-layer architecture, the HIS is mainly taken to be composed of the innate and adaptive subsystems. The innate subsystem is characterized mainly as a first line of defence that provides a rapid, non-specific response to pathogens. The adaptive subsystem, on the other hand, is described as a system capable of highly specific recognition and remembrance of pathogens. Past AIS research concentrated mostly on the adaptive immune subsystem until around 2005, when, with AIS research at an impasse, new concepts had to be sought. As with immunology research itself, recent work has since shifted focus to understanding the role of the innate subsystem in the functioning of the HIS. A large number of immunology papers have uncovered many mechanisms by which the innate system interacts with the adaptive system. They characterize the role of the innate system as: host defence in the early stages of infection through non-specific recognition of a pathogen, induction of the adaptive immune response, and determination of the type of adaptive response. This presents the innate system as the major controller of the adaptive system. This new understanding requires that innate system concepts be integrated into the overall design of AIS-based models if the desired properties of the HIS are to be achieved, and if AIS research is to move beyond the impasse caused by the use of concepts inspired only by the adaptive immune system.

The hypothesis of this thesis is therefore that, by incorporating innate system concepts into the design of AISs, most of the desired properties of the HIS can be achieved and the performance of the resulting system thus enhanced.
This is further based on the views that the HIS comprises two interacting subsystems, the innate and the adaptive immune systems, and that the innate system largely controls the adaptive system. The research presented in this thesis mainly investigates innate system concepts and examines how inferences drawn from their operation can be used in the design of an AIS model. Several salient features of the HIS involved in the detection of and response to intruding pathogens are carefully studied, and the possibility and advantages of adopting these features for the development of an AIS model are reviewed and assessed. Specifically, the proposed architecture, and therefore the focus of this thesis, is on the incorporation of the following HIS concepts: i. use and characterization of 'danger signals' by the innate layer cells; ii. nature and interaction of Pattern Recognition Receptors (PRRs) and Pathogen Associated Molecular Patterns (PAMPs) in innate layer cells; iii. processing and presentation of antigens by the innate system's Dendritic Cells (DCs); and iv. primary and secondary response mechanisms of the adaptive layer cells. These mechanisms are seen to be instrumental in the detection and removal of invading pathogens in the HIS.

Thus the contributions of this thesis are as follows. A detailed review of the biology related to the aforementioned concepts is undertaken and a simple abstraction of the HIS concepts developed. The abstraction is then used to develop the INIAIS (INtegrated Innate and Adaptive Immune System) architecture, which incorporates the aforementioned concepts in a single architecture, with each of the corresponding components providing functionality analogous to its HIS counterpart. Specifically, the following architecture is proposed for the design of the INIAIS model: it should have two interacting layers, an innate layer and an adaptive layer. The innate layer provides the interface between the system and the environment in which it operates. Three key processes are included at the innate
level: a danger signal detection and response mechanism, a PAMP detection and response mechanism, and a DC antigen processing and presentation process. The adaptive layer provides two processes: primary and secondary response mechanisms.

Assume a two-class discrimination problem, say self versus nonself or normal versus abnormal, where the objective of a system is to distinguish the occurrence of one from the other. The identified processes then provide the following. Danger signals give the model an indication of a possible abnormality: every instance of the abnormal class must trigger some danger signal, while an instance of the normal class may or may not trigger any. The logic behind this is two-fold. Firstly, by limiting the response to only those instances that have danger signals, an implicit adaptation to a changing normal that possesses no danger signals is achieved. Secondly, the danger signal concept also allows for some filtering of data, reducing the data passed to subsequent layers for further processing, improving scalability, and promoting lightweight modules. An implicit adaptation to changing self helps overcome a previous limitation of AIS models that used a self-nonself discrimination principle. With the self-nonself principle, it was assumed that the set of normal instances was known, and detectors were trained to detect any occurrences not previously encountered. For a rarely changing environment the self-nonself principle worked quite well, but in a continuously changing environment there was an explicit need to keep updating the set of self to help reduce false alarms.

The PAMP signal detection and response mechanism provides INIAIS with a detection and response ability at the innate layer.
The HIS innate cells use PRRs, which recognize evolutionarily conserved PAMP features known to identify a broad class of pathogens. This gives the innate cells the ability to detect and eliminate those bacteria or viruses that manage to get into the body before they have a chance to reproduce and proliferate, providing early protection to the body from pathogens. The INIAIS model is similarly designed to detect and mount an immediate response to the occurrence of abnormal class instances possessing PAMPs. As in the HIS, PAMPs are defined as signals possessed only by instances of the abnormal class. As a result, PAMPs provide a second filtering point for the input data, further improving the scalability, robustness, self-organization and lightweight properties of the resulting system. Robustness is improved because detection is provided at two distinct layers, as opposed to previous AIS models that provided a single layer for detection, thus improving the fault tolerance of INIAIS. If an abnormal occurrence should trigger some kind of response, PAMPs give INIAIS a mechanism to automate such a response without interfering with normal instances; this accounts for the self-organization aspect. Reduction of data at this level ensures that only a subset of the input data is passed on to the adaptive layer for further processing, achieving the lightweight property to some extent.

Most current AIS models include all attributes present in an input-data instance when developing detectors for the discrimination process. For input data with a high number of attributes, this results in a highly complex search space. DC antigen processing provides the INIAIS model with a feature extraction mechanism. Feature reduction helps reduce the space and time complexity of a given problem by reducing the input data to fewer attributes that should readily help in the detection of some given abnormal
class instances. The logic behind this is that different abnormal class instances should be readily detectable using only a subset of the attributes present in a given input. This applies especially where abnormal class instances can be further grouped into smaller sub-classes that differ significantly in their structure or behaviour. In such cases, DC antigen processing helps develop different feature-vectors that can then be used by different groups of detectors at the adaptive layer to separately identify occurrences of abnormal instances. This leads to different detector sets at the adaptive layer, each specialized in detecting particular occurrences of the abnormal class. Compared to a design based on the full set of attributes, which develops only one set of detectors, DC antigen processing gives the INIAIS model design increased fault tolerance, as the failure of one detector set does not render the system ineffective. DC presentation requires that only the necessary subset of detectors be involved in the detection process, depending on the kind of danger present. Involving just a subset of the total detectors implies that any processing done at the adaptive layer will involve fewer operations, significantly improving the efficiency of the resulting system. Adaptive layer processes are thus required to process only a subset of the input data, i.e. the instances remaining after the filtering processes at the innate layer, using different sets of detectors chosen according to the type of danger signal present.
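The flow through the two layers described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the class name `Iniais`, the representation of signals as conjunctions of (attribute, value) pairs, and the exact-match rule for adaptive detectors are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Instance = Dict[str, str]             # attribute name -> discretised value
Signal = Tuple[Tuple[str, str], ...]  # conjunction of (attribute, value) pairs


def fires(signal: Signal, x: Instance) -> bool:
    """A signal fires when every (attribute, value) pair is present in x."""
    return all(x.get(attr) == val for attr, val in signal)


@dataclass
class Iniais:
    """Hypothetical container for the innate and adaptive layer components."""
    danger_signals: Dict[Signal, str]           # danger signal -> abnormal sub-class
    pamps: List[Signal]                         # signals seen only in abnormal data
    features: Dict[str, List[str]]              # sub-class -> attributes kept by DC processing
    detectors: Dict[str, Set[Tuple[str, ...]]]  # sub-class -> detector feature-vectors

    def classify(self, x: Instance) -> str:
        # Innate layer, step 1: no danger signal means the instance is treated as normal.
        classes = {c for s, c in self.danger_signals.items() if fires(s, x)}
        if not classes:
            return "normal"
        # Innate layer, step 2: a PAMP match is handled immediately.
        if any(fires(p, x) for p in self.pamps):
            return "abnormal (innate)"
        # DC processing and presentation: project x per signalled sub-class and
        # consult only that sub-class's detector set (exact match is an assumption).
        for c in sorted(classes):
            peptide = tuple(x.get(a, "") for a in self.features[c])
            if peptide in self.detectors[c]:
                return "abnormal (adaptive)"
        return "normal"
```

In this sketch an instance without danger signals never reaches the adaptive layer, which is the filtering effect the text attributes to the innate layer. The attribute names used to populate such a model (for example a connection flag or service) would be hypothetical as well.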
The primary response mechanism is used to adapt the adaptive layer detectors to previously unseen occurrences of abnormal class instances, while the secondary response mechanism provides memory of previously encountered abnormal instances. As noted, unlike previous models that required the whole detector repertoire to be used for detection, INIAIS uses different subsets trained to detect instances of different abnormal sub-classes. This greatly reduces the number of operations undertaken during detection, improving its efficiency. It also ensures that detectors at this level remain diversified, as different detector sets are maintained for each sub-class present.

To help validate and evaluate the INIAIS architecture, a real-world problem that can greatly benefit from the application of HIS concepts is considered. The HIS can be regarded as a remarkably efficient and powerful information processing system that operates in a dynamic and unpredictable environment in which it is necessary to react to changes in a timely manner. This suggests, as potential application areas for the immune system metaphor, problems in dynamic environments that call for robust and 'good enough' solutions allowing a system to continue functioning satisfactorily. These features are characteristic of a number of real-world problem domains; in this thesis, a network intrusion detection problem is chosen. As a field of computer security, intrusion detection aims to monitor the activity of an information system for the occurrence of malicious activities intended to violate the security policy governing confidentiality, integrity and availability of services and data. Clearly, the problems faced by the immune system in recognizing and eliminating pathogens on a relatively short timescale are analogous to those faced in the intrusion detection domain. An effective network-based IDS is considered to be robust in that it should have
multiple detection points that are tolerant of attacks on, and faults in, the IDS itself. It should also be configurable, in that it should be able to configure itself easily to the local requirements of each host or network component. It should be easy to extend the scope of IDS monitoring to new hosts, regardless of operating system. It must also scale reliably to gather and correctly analyze a high volume of audit data from distributed hosts, and be dynamically adjustable in order to detect dynamically changing network intrusions. It should also collectively monitor multiple events generated on various hosts to integrate sufficient evidence and identify correlations between events, and be simple and lightweight enough to impose a low overhead on the monitored host systems and network.

A prototype system, INIAIS-NIDS, is designed and developed based on the INIAIS architecture and applied to network intrusion detection. The prototype is evaluated and the significance of each of the abstracted HIS concepts incorporated in INIAIS analysed. Specifically, several hypotheses are formulated and tested.

Hypothesis I: "It is possible to develop a set of danger signals that could be used to alert a network intrusion system of all possible attacks".
In other words, is the application of danger signals as interpreted in the INIAIS model feasible? The effect of applying danger signals as interpreted is also evaluated in the resulting system. To help determine what constitutes a danger signal, a guiding principle is adopted: "the presence of danger signals may or may not indicate an anomalous situation (an attack), however the probability of an anomaly is higher than under normal circumstances". Granular computing concepts are then used in the development of the different danger signals. In essence, any granule that has more attack instances than normal occurrences is considered a candidate danger signal. The results provide evidence that supports the hypothesis and also show that the use of danger signals allows early filtering of a significant percentage of normal traffic, data that requires no further processing. Reducing the number of normal instances early helps not only to reduce the false alarm rate of the resulting system but also with its scalability to environments with high data inputs, as fewer normal instances are involved in further processing. In a real-world setup, it is expected that only a very small fraction of traffic would possess danger signals; much normal traffic would therefore be filtered early, avoiding the creation of a bottleneck by the NIDS.

Hypothesis II is concerned with PAMP signals and states that "It is possible to develop a set of PAMP signals that could be used to identify only attacks", i.e. is the concept of PAMPs feasible as interpreted in the INIAIS model?
As with the danger signals, granular computing concepts are used to develop the PAMP signals. Essentially, PAMP signals correspond to danger signals expressed at a finer granular level that limits them to attack instances only, i.e. PAMPs are exclusively associated with anomalous situations. Results indicate that the hypothesis is valid as formulated. In addition, the use of PAMPs presents a compact and efficient way to detect and eliminate the majority of the attacks (in this case, over 90%) early from the system. PAMPs are more general than the very specific rules or signatures used in misuse-based systems, as they present patterns common to a group of attacks rather than single-attack signatures; it can thus be argued that they present a more effective way to eliminate the majority of attacks early in the system. PAMPs are not only compact signatures but also generalize well to closely related attacks, giving the system some anomaly detection ability. As an example of their generalization, PAMPs derived from only the training data generalized very well when applied to previously unseen test data, detecting over 90% of all attacks present. PAMPs also give INIAIS-NIDS sufficient grounds to automate responses to attacks detected at this stage, ensuring that such automated responses do not introduce any undesired effect. Thus PAMPs further help in filtering data early in the detection process, reducing the likelihood of INIAIS-NIDS creating a bottleneck in the monitored network.
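The two selection rules stated for danger signals and PAMPs can be sketched as follows. This is a simplified illustration rather than the thesis's granular computing procedure: it assumes each granule is a single (attribute, value) pair over already-discretised data, whereas the text describes PAMPs as living at a finer granular level, which could mean conjunctions of several pairs. The function name `derive_signals` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Instance = Dict[str, str]
Granule = Tuple[str, str]  # one (attribute, value) pair; an assumed granule shape


def derive_signals(
    data: Iterable[Tuple[Instance, bool]],
) -> Tuple[Set[Granule], Set[Granule]]:
    """Return (danger_signals, pamps) from labelled (instance, is_attack) pairs."""
    attack: Counter = Counter()
    normal: Counter = Counter()
    for x, is_attack in data:
        (attack if is_attack else normal).update(x.items())
    # Candidate danger signal: more attack instances than normal occurrences.
    danger = {g for g, n in attack.items() if n > normal[g]}
    # PAMP: a danger-signal granule never observed in normal data.
    pamps = {g for g in danger if normal[g] == 0}
    return danger, pamps
```

Under this reading every PAMP is also a danger signal, which matches the relationship between the two that the text describes.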
The use of PAMPs also allows INIAIS-NIDS to scale well to large amounts of data (traffic).

Hypothesis III concerns the DC antigen processing and presentation concept. It addresses not only the feasibility of the concept but also the implications of using it in the design of a NIDS. It states that "DCs antigen processing and presentation provides for a mechanism to improve efficiency and effectiveness of the detection process at the adaptive layer of an AIS-based NIDS". This is in comparison to current AIS approaches, where the detection process employs detectors that use all attributes present in an input datum (antigen) and all detectors in the repertoire are involved in the detection process. Each danger signal ds_i is associated with a given attack class class_j, based on the attack instances detected by that danger signal. The collection of all danger signals for a given attack class is then used to identify the attributes of concern for that class. The identified attributes are then used as the basis for the antigen-peptide (feature-vector) developed for that class. Given that different classes will ultimately use different subsets of features from the input data, different class detectors will be specialized to identify different attack instances based on only the most relevant attributes. In turn, an attack signalled by a given danger signal will activate only the necessary subset of detectors at the adaptive layer. The experimental results of the proposed approach are then compared with two different mechanisms for full-attribute detectors, hyper-cubes and partial matching, as used in current approaches. The results show that the detectors developed with the proposed approach are very discriminative, as very few are seen to match the self-peptides. This allows the detectors to provide significantly better coverage of the attack instances than current approaches.
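The association of danger signals with attack classes and the resulting per-class feature-vectors can be sketched in Python. The sketch assumes that a danger signal is assigned to the class of the majority of attack instances it detects; the text only says the association is based on the detected attack instances, so the majority rule, like the function names `class_features` and `peptide`, is an assumption.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Instance = Dict[str, str]
Signal = Tuple[Tuple[str, str], ...]  # conjunction of (attribute, value) pairs


def fires(signal: Signal, x: Instance) -> bool:
    """A signal fires when every (attribute, value) pair is present in x."""
    return all(x.get(a) == v for a, v in signal)


def class_features(
    signals: List[Signal], attacks: List[Tuple[Instance, str]]
) -> Dict[str, List[str]]:
    """Map each attack class to the attributes named by its danger signals."""
    features: Dict[str, Set[str]] = defaultdict(set)
    for ds in signals:
        # Associate ds_i with the class_j of most attack instances it detects
        # (majority rule is an assumption of this sketch).
        hits = Counter(label for x, label in attacks if fires(ds, x))
        if hits:
            cls = hits.most_common(1)[0][0]
            features[cls].update(a for a, _ in ds)
    return {cls: sorted(attrs) for cls, attrs in features.items()}


def peptide(x: Instance, attrs: List[str]) -> Tuple[str, ...]:
    """Project an antigen onto a class's attribute subset (its feature-vector)."""
    return tuple(x.get(a, "") for a in attrs)
```

Each class then trains and consults detectors only over its own projected feature-vectors, which is where the reduction in search-space complexity claimed in the text would come from.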
Additionally, the individual group detector sets also provide good coverage of the attack instances when used on their own (independently), thus increasing the robustness of INIAIS-NIDS. The proposed approach also helps reduce the complexity of the search space compared to current approaches, as each detector set concentrates on just a smaller part of the input data feature-vector. Involving just a subset of the adaptive layer detectors implies that processing at this stage involves fewer operations and needs less memory during the detection process, significantly improving the efficiency of the resulting system. As such, the hypothesis was found to be valid as stated.

Detectors developed at the adaptive layer are highly specialized to help reduce false alarm rates as much as possible. However, developing such highly specialized detectors implies that their generalization to previously unseen attacks is very poor, analogous to the cells of the adaptive immune system. As such, primary response mechanisms need to be incorporated to help detectors adapt to previously unknown attacks. Experiments are undertaken in which the detectors with the highest "affinity" (highest matching value) for a given antigen-peptide are adapted. The "adaptation" process involves the use of a mutation operator. Once a match is found, the matching detector undergoes a cloning operation, in which copies of it are created.
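The affinity-based selection, cloning and mutation steps can be sketched as below. The affinity measure (number of agreeing positions), the single-site random mutation, and the names `affinity` and `clone_and_mutate` are assumptions for illustration; the abstract does not specify the operators actually used.

```python
import random
from typing import List, Sequence, Tuple

Detector = Tuple[str, ...]


def affinity(detector: Detector, peptide: Sequence[str]) -> int:
    """Matching value: number of positions where detector and peptide agree."""
    return sum(d == p for d, p in zip(detector, peptide))


def clone_and_mutate(
    detectors: List[Detector],
    peptide: Sequence[str],
    domains: Sequence[Sequence[str]],  # allowed values per position
    n_clones: int = 4,
    seed: int = 0,
) -> List[Detector]:
    """Clone the highest-affinity detector and mutate one random site per clone."""
    rng = random.Random(seed)
    best = max(detectors, key=lambda d: affinity(d, peptide))
    clones: List[Detector] = []
    for _ in range(n_clones):
        clone = list(best)
        pos = rng.randrange(len(clone))  # random mutation site
        alternatives = [v for v in domains[pos] if v != clone[pos]]
        if alternatives:                 # leave the site alone if nothing else is allowed
            clone[pos] = rng.choice(alternatives)
        clones.append(tuple(clone))
    return clones
```

Choosing the mutation site uniformly at random is the simplest possible operator; a guided choice of site would need additional information that this sketch does not model.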
One of the clones is promoted to a memory detector whilst the others are further mutated. Experimental results show that once a detector set has been developed from the training data, copies of these detectors can be created and mutation applied to the copies to improve the anomaly detection capability of the resulting detector sets, analogous to the hyper-mutation of B cells in the HIS. In search-space terminology, this is like 'exploring' the regions near a given detector, ensuring that, although the detectors developed at the adaptive layer are very specific, they still cover the wider "surroundings" of known attacks. However, it was noted that a mutation operator applied randomly (at random locations of a detector's feature-vector) to the clones was not very effective. Specifically, the results provide some evidence that a detector set that covers known nonself adequately should be cloned and mutated as a final step of the training period to provide better adaptability to unknown nonself. However, they also suggest the need for a well-guided mutation process if effective detectors are to be generated, and for a guided method to identify the nonself to adapt to, especially if the whole process is to continue without human intervention. Memory detectors are used to provide a secondary response mechanism, much like the HIS cells that respond readily to re-exposure to the same antigen. In this case they provide INIAIS-NIDS with a first-line adaptive layer defence against attacks.

Hypothesis IV is then used to compare the performance of an AIS that incorporates innate layer mechanisms with one based on adaptive layer mechanisms alone. It states that "INIAIS-NIDS model as a whole improves the performance of an AIS model that does not incorporate innate layer concepts".
The performance of the implemented INIAIS-NIDS prototype is compared against a negative selection algorithm (NSA) implementation. NSA remains the most established and widely used AIS-based mechanism applied to intrusion detection, and as such acts as a point of reference for most algorithms aimed at improving AIS-based solutions. Several experiments are carried out and different performance measures used for evaluation. The experimental results support hypothesis IV as stated. Some of the problems associated with a one-level detection approach are also highlighted: the separation between generalization and specialization of detectors is hard to achieve; in a highly dynamic environment it is very hard to eliminate all detectors that react against self-antigens, so such an approach will always have significantly high false positive rates; and when the data volume is large and of very high dimensionality, the number of detectors needed for good performance increases exponentially.
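For orientation, a generic negative selection baseline of the kind referred to above can be sketched as follows. This is the textbook generate-and-censor form over discrete feature-vectors, not the implementation used in the thesis; the partial matching rule (at least `r` agreeing positions) and the names `train_nsa` and `is_nonself` are assumptions.

```python
import random
from typing import List, Sequence, Tuple

Vector = Tuple[str, ...]


def matches(detector: Vector, x: Vector, r: int) -> bool:
    """Partial matching (assumed rule): at least r positions agree."""
    return sum(d == v for d, v in zip(detector, x)) >= r


def train_nsa(
    self_set: List[Vector],
    domains: Sequence[Sequence[str]],  # allowed values per position
    n_detectors: int,
    r: int,
    seed: int = 0,
    max_tries: int = 10_000,
) -> List[Vector]:
    """Generate random candidates and keep those that match no self sample."""
    rng = random.Random(seed)
    detectors: List[Vector] = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        cand = tuple(rng.choice(dom) for dom in domains)
        # Censoring: drop any candidate that reacts to a self sample.
        if not any(matches(cand, s, r) for s in self_set):
            detectors.append(cand)
    return detectors


def is_nonself(x: Vector, detectors: List[Vector], r: int) -> bool:
    """Flag x as nonself when any surviving detector matches it."""
    return any(matches(d, x, r) for d in detectors)
```

Because every detector is defined over the full feature-vector and the self set is fixed at training time, the sketch makes the listed problems visible: a change in self requires retraining, and the candidate space grows with each added attribute.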
The INIAIS-NIDS model provides some solutions to these problems of one-level detection. By using a two-layer approach, it provides a clear separation between the generalization and specialization of the detectors used. By using danger signals to activate the detection process, it overcomes the need for self-nonself discrimination as the central concept in detection, and thus allows the system to adapt implicitly to "normal" dynamic environments. Using the danger signals and generalized PAMP signatures early in the system significantly reduces the number of instances, so that further processing involves just a small proportion of the initial volume, and DC processing helps reduce high dimensionality, significantly reducing the complexity of the resulting coverage space.

In general, the INIAIS-NIDS model design introduces major variations in the development of AIS-based NIDS compared to those in the literature. The model allows for real-time application and at the same time achieves most of the AIS features that have been considered advantageous in the design and development of novel IDS. A re-evaluation against the features highlighted for an effective NIDS reveals the following. The INIAIS-NIDS model incorporates "sensors" in three different components of its design, each used for a distinct functionality. This allows each component to have a well-defined role in the overall functionality of the resulting system. At each component level some filtering takes place, reducing the amount of data to be processed by subsequent components. This gives the resulting system a lightweight property.
The innate and adaptive layers of the model are both designed to detect and respond to some of the nonself antigens, allowing for several layers of detection and response. Thus the system can still perform, though in a limited capacity, with either of the layers alone, resulting in a system that is truly multi-layered with disposable components, and as such more robust than when detection takes place only in the adaptive layer, as has been the case with most AIS-based systems. The danger signal concept of the innate layer requires that an occurrence of nonself must have some danger sign(s), which then trigger the detection system. A danger sign is taken to be any occurrence or feature-value (or combination thereof) that is likely to represent an attack. Given that an intrusion must either exhibit some form of pattern, for a misuse-based IDS, or result in a significant deviation from the behaviour normally observed in the system, for an anomaly-based IDS, such a requirement follows as a necessity. This interpretation allows the nonself space to be bounded to a subset of the set created by the union of the attribute-values used to define the danger signals. By further restricting the bounded region to the subsets that contain only instances of nonself, a smaller subset can be derived.
Each of the attribute-values, or combinations of values, used to define the resulting smaller subset can be used as a PAMP, as it is guaranteed to be present only in nonself antigens. This creates a relationship between danger signals and PAMPs and allows more efficient processing to be developed on the basis of this relationship. It also forms a basis for fully automating the response to antigens found in the smaller subset defined by PAMPs, giving the resulting system some level of self-organization. A minimal amount of data, representing the set of antigens that cannot be easily differentiated, is then passed to the adaptive layer for further processing, greatly improving the scalability of the system. Antigens that do not possess any danger signs are always treated as part of self and do not trigger any immune reaction. In this case, the self space is not restricted to the set of normally occurring antigens, as has been the case with previous AIS implementations; the system therefore adapts easily to a dynamic environment where the self antigens keep changing, without necessarily increasing false alarms. By allowing subsets of the adaptive layer cells to specialize in detecting a given class of attacks, diversity of the repertoire is guaranteed.

INIAIS-NIDS performs well compared to existing AIS models applied to the same domain. In particular, the danger signals developed are able to detect all the intrusion cases, while the PAMP signals provide very good early detection and response performance. The antigen processing and presentation component also showed a remarkable improvement in performance compared to a case where the same processes are not implemented.
The need to incorporate a primary response mechanism to help adaptive layer detectors detect previously unseen attacks, thereby reducing the chance of missed attacks and increasing the overall performance of the system, is also highlighted, and some insight into how the adaptability problem could be reduced is provided.

In conclusion, the overall aim of this research is to examine the HIS metaphor in detail, in an effort to understand and capitalize on those features of the metaphor that would help distinguish AIS from other existing methodologies, increasing its significance as a problem-solving mechanism and ultimately leading to its increased adoption in many domains. The success of the INIAIS-NIDS model on the problem considered confirms the utility of the metaphor developed in this work. In particular, this thesis provides evidence that an AIS model which incorporates both innate and adaptive immune system mechanisms improves on the performance of current AIS approaches in the following ways: ⅰ. Developing a truly layered architecture provides a separation of the tasks achieved at each layer while allowing the layers to complement each other, thus improving the scalability and robustness of the resulting system. ⅱ. Use of danger signal concepts allows implicit adaptation to changing self and allows self data to be filtered from the system early, reducing the possibility of false alarms as well as improving the scalability of the resulting system. ⅲ. Use of PAMPs provides a compact and efficient way of developing and presenting generalized patterns for a given problem, further improving the scalability of the system. ⅳ.
Use of the DC antigen processing and presentation concepts provides a form of feature selection mechanism, which helps reduce space and manipulation complexity, especially when data dimensionality is large. This concept also gives the innate layer some control over the adaptive layer, determining the kind of response to be invoked. ⅴ. Use of several subsets of detectors at the adaptive layer helps increase the robustness of the resulting system and provides more efficient and effective processing at the adaptive layer. ⅵ. The proposed approach also provides some insight into how existing AIS models can be integrated in the development of more effective and efficient AIS-based solutions, as opposed to their continued piecemeal implementation.
Keywords/Search Tags: Artificial Immune System Architecture, Artificial Immune Systems, Network Intrusion Detection, Danger Theory, Dendritic Cells, PAMPs, PRRs, Innate Immune System, Adaptive Immune System, Granular Computing