
Neural networks with categorical valued inputs and applications to intrusion detection

Posted on: 2005-10-02
Degree: Ph.D.
Type: Dissertation
University: University of Missouri - Rolla
Candidate: Novokhodko, Alexander
Full Text: PDF
GTID: 1458390008998873
Subject: Engineering
Abstract/Summary:
Neural networks, a powerful machine learning paradigm, have been successfully applied to a wide spectrum of practical problems. As universal approximators, neural networks work best with quantitative inputs, where different values of an input indicate different magnitudes (e.g., intensity). For ordinal inputs, where different values of an input signify an order relation (e.g., position in a sequence), neural networks still perform reasonably well. For categorical inputs, where different values of an input correspond to different qualities (e.g., shape), performance on real-life problems often suffers.

To feed categorical values to a neural network, they are enumerated either as real values or as binary vectors. However, this encoding introduces false magnitude and order relationships into the input data. To handle vectors of categorical-valued inputs, one can use a metric that discards the magnitude/order noise introduced by the numerical encoding of the categorical values. One such metric is the Hamming distance. The applicability of the Hamming distance to neural networks has not been adequately studied. This work fills the gap and shows that probabilistic neural networks using the Hamming distance with categorical inputs retain their asymptotic Bayes-optimal properties.

Further, ensembles of probabilistic neural networks with Hamming distance kernels are used to classify sequences of system calls in a host-based intrusion detector. The networks are trained to detect rare occurrences of malicious program behavior among predominantly normal data, while minimizing the false positive alarm rate. The performance is significantly better than other published neural network IDS results.
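As a rough illustration of the idea in the abstract, the following is a minimal sketch of a probabilistic neural network (a Parzen-window classifier) whose kernel depends only on the Hamming distance between categorical sequences. It is not the dissertation's implementation: the exponential kernel form `exp(-d / sigma)`, the smoothing parameter `sigma`, the function names, and the toy system-call data are all assumptions made for this example.

```python
import math
from collections import defaultdict


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def pnn_classify(train, query, sigma=1.0):
    """Classify `query` with a Parzen-window (PNN-style) estimate.

    train: list of (sequence, label) pairs; sequences hold categorical symbols.
    sigma: hypothetical smoothing parameter of the assumed kernel exp(-d / sigma).
    Returns (predicted_label, scores), where each score is the class prior
    (estimated from class frequency) times the class-conditional kernel average.
    """
    kernel_sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, label in train:
        # The kernel sees only how many positions differ, never the
        # numeric codes of the symbols, so no false order is introduced.
        kernel_sums[label] += math.exp(-hamming(seq, query) / sigma)
        counts[label] += 1

    total = len(train)
    scores = {}
    for label in counts:
        prior = counts[label] / total
        density = kernel_sums[label] / counts[label]
        scores[label] = prior * density
    return max(scores, key=scores.get), scores


# Toy data (invented for illustration): short windows of system-call names.
train = [
    (("open", "read", "close"), "normal"),
    (("open", "read", "write"), "normal"),
    (("exec", "chmod", "exec"), "attack"),
]

label_a, scores_a = pnn_classify(train, ("open", "read", "close"))
label_b, scores_b = pnn_classify(train, ("exec", "chmod", "close"))
print(label_a, label_b)
```

Because the symbols are compared only for equality, renaming or renumbering the categories leaves every distance, and therefore every classification, unchanged. That is the property a numerical encoding with Euclidean distance does not have.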
Keywords/Search Tags:Neural, Categorical, Inputs, Hamming distance