
Research On Anomaly Intrusion Detection Of Web Application Based On Data Mining

Posted on: 2012-06-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Yu
Full Text: PDF
GTID: 1118330368984114
Subject: Computer system architecture
Abstract/Summary:
Intrusion detection is a key technology in network and information security and has become an important part of network security architecture. By analyzing data collected at critical points of computer systems and networks, it can identify behavior that violates security policy as well as signs of attack. Traditional misuse detection models known attacks to build a rule library and detects known attacks effectively. Its main disadvantages are that it cannot detect unknown attacks, so its miss rate is high, and that its signature library must be updated regularly. Anomaly detection summarizes the characteristics of normal behavior into a profile and trains on that profile to obtain a model of normal behavior. When a behavior deviates statistically from the normal model, it is treated as an intrusion. Anomaly detection can detect unknown attacks and adapts well; its main disadvantage is a high false positive rate.
Data mining can extract models of normal and abnormal behavior from large volumes of audit data, which significantly reduces the heavy work of manual analysis and coding and improves the adaptability of intrusion detection systems. For this reason, data mining is widely used in intrusion detection.
Because system environments differ and attack methods vary, traditional host-based or network-based anomaly detection is inefficient, lacks a specific detection data source, and relies on detection models with limited descriptive power, which makes it difficult to detect varied attacks effectively. Therefore, this dissertation presents a new application-based anomaly detection technology specifically for web applications, covering vulnerability analysis of web applications, data source selection, data model construction and evaluation, and various detection algorithms.
Although the methods proposed in this dissertation primarily target web applications, their ideas can also be applied to general anomaly detection.
Frequent pattern mining is an important area of data mining and the basis of association analysis. Dynamic maintenance of the traditional FP-tree structure is complex, and FP-growth is not efficient in either time or space. A new NFP-tree algorithm is introduced that removes the redundant operations of FP-growth and improves efficiency. In frequent sequence pattern mining, the system call sequences of the web application are used as the data source. By analyzing the three fundamental structures of program flow (sequence, selection, and iteration), a new anomaly detection method, MWVP (Multi Wildcards Variable-length Pattern), is proposed. The algorithm follows the idea of TEIRESIAS and adds redundancy control: redundant patterns are identified while models are being generated, so the procedure stops early, which improves efficiency.
In association analysis, the web access log file is used as the data source. The relationships among parameters in HTTP requests are analyzed using association analysis. Four anomaly detection models are presented, and an anomaly score is computed for each; the total anomaly score is their weighted sum. This method can detect a variety of known and unknown attacks with a very low false positive rate.
Cluster analysis is an important data mining method that is useful for analyzing the internal relationships among data points. From a data processing perspective, data corresponding to intrusion behavior can be marked as anomalous, since data from normal and abnormal behavior have different characteristics.
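The weighted-sum scoring mentioned in the association analysis paragraph can be illustrated with a minimal Python sketch. The abstract does not name the four models, their weights, or a decision threshold, so the model names (`model_a` to `model_d`), the weight values, and `THRESHOLD` below are hypothetical placeholders, not values from the dissertation.

```python
from typing import Mapping

# Hypothetical model names and weights; the abstract only states that
# four models are combined by a weighted sum.
WEIGHTS = {"model_a": 0.4, "model_b": 0.3, "model_c": 0.2, "model_d": 0.1}

# Hypothetical decision threshold for the combined score.
THRESHOLD = 0.5


def total_anomaly_score(scores: Mapping[str, float],
                        weights: Mapping[str, float] = WEIGHTS) -> float:
    """Combine per-model anomaly scores into one weighted sum."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for models: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def is_anomalous(scores: Mapping[str, float],
                 weights: Mapping[str, float] = WEIGHTS,
                 threshold: float = THRESHOLD) -> bool:
    """Flag a request when the combined score exceeds the threshold."""
    return total_anomaly_score(scores, weights) > threshold
```

Under these assumed weights, a request that only the first model scores as fully anomalous stays below the threshold, while one flagged by the first two models exceeds it.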
In a suitable feature space, data from intrusions and from normal behavior separate from each other, while data from the same type of behavior (intrusion or normal) cluster together. Based on this observation, an outlier mining algorithm based on degree of isolation is presented. The HTTP requests in the web access log file are used as the data source. A distance function and the statistical properties are defined for each type of parameter. By mapping these distances using their mean and standard deviation, a uniform distance formula is proposed. The cluster center and the degree of isolation are also defined. Following the idea of statistical outlier mining, an outlier mining method is presented that uses an approximate normal distribution model. Experimental results show that the new algorithm can detect a variety of web attacks with higher detection accuracy than two other algorithms.
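The general shape of such a statistical outlier test can be sketched in Python. This is a simplified illustration, not the dissertation's algorithm: it assumes requests are already encoded as numeric feature vectors, takes the cluster center to be the per-feature mean, uses Euclidean distance to that center as a stand-in for the degree of isolation, and flags points whose isolation lies more than `k` standard deviations above the mean under an approximate normal model. The function names and the default `k = 3.0` are assumptions made for this example.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def centroid(points: Sequence[Sequence[float]]) -> List[float]:
    """Per-feature mean, used here as an assumed cluster center."""
    return [mean(column) for column in zip(*points)]


def isolation_degrees(points: Sequence[Sequence[float]],
                      center: Sequence[float]) -> List[float]:
    """Euclidean distance to the center, a stand-in for degree of isolation."""
    return [math.dist(p, center) for p in points]


def flag_outliers(points: Sequence[Sequence[float]],
                  k: float = 3.0) -> List[bool]:
    """Flag points whose isolation is more than k standard deviations
    above the mean isolation (approximate normal model)."""
    center = centroid(points)
    iso = isolation_degrees(points, center)
    mu, sigma = mean(iso), pstdev(iso)
    if sigma == 0:
        # All points are equally isolated, so nothing stands out.
        return [False] * len(points)
    return [(d - mu) / sigma > k for d in iso]
```

For example, calling `flag_outliers` on twenty identical vectors plus one distant vector marks only the distant one. The dissertation's own distance functions are defined per parameter type and mapped to a uniform scale, which this sketch does not attempt to reproduce.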
Keywords/Search Tags: Data Mining, Web Application, Intrusion Detection, Anomaly Detection, Association Analysis, Frequent Pattern Mining, Cluster Analysis