Algorithms of Sequence Analysis and Their Applications in Intrusion Tolerance

Posted on: 2007-03-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Zhao
GTID: 1118360242961984
Subject: Computer application technology
Abstract/Summary:
Progress in digital data acquisition, distribution, retrieval, and storage technology has resulted in the growth of massive databases. One of the greatest challenges that organizations and individuals face is how to convert their rapidly expanding collections of data into accessible and actionable knowledge. Attempts to overcome this hurdle have brought together researchers from different areas, such as statistics, machine learning, and databases, which has resulted in a new research field called data mining. Sequence analysis is a fundamental problem in data mining. It arises in the discovery of association rules, strong rules, correlations, multidimensional patterns, and many other important discovery tasks. Sequence analysis helps reveal the associations among sequence data and forecast how they will evolve, and it is becoming increasingly essential in many domains, such as commercial decision making, information security, gene analysis in biology, and scientific computing.
With the development of networks and other information technology, sequence analysis faces new challenges. First, datasets are huge and contain large amounts of incomplete, redundant, and noisy data, which makes it difficult for traditional algorithms to mine rules from them. Second, stream data is generated continuously in a dynamic environment, with huge volume, unbounded flow, and fast-changing behavior. Traditional algorithms are also inadequate for analyzing such data streams.
This dissertation therefore focuses on efficient sequential pattern mining algorithms and sequence forecasting algorithms, and on real-time methods, intrusion-tolerant intrusion detection methods, and error detection methods based on sequence analysis, which have both theoretical significance and practical value for improving the performance of intrusion tolerance systems.
Sequential pattern mining is an important means of sequence analysis and can be applied in many fields. The Bayesian approach has become an important method in data mining because of its strong ability to process incomplete, redundant, and noisy data. This dissertation applies the Bayesian approach to sequential pattern mining, puts forward a statistical model for sequence data, and designs a new Bayesian mining algorithm for sequences in a noisy environment. It also presents BMSP-DS, a new efficient algorithm for mining sequential patterns in data streams, which is based on an extendable sliding window and Bayesian probability filtering. By eliminating low-probability sequence candidates, the algorithm reduces the temporary data produced during mining and speeds up the mining of frequent sequential patterns under limited time and space. One of the most advanced research challenges in sequence stream analysis is to speed up the analysis process and remove the interference of noisy data. A parallel sequence stream analysis algorithm named FTPSA, which is based on proactive fault-tolerant knowledge learning, is proposed. This dissertation also discusses some key issues of FTPSA and presents the experimental results. Compared with other algorithms, FTPSA is more fault-tolerant, scalable, and accurate, and uses less memory.
Sequence forecasting has become an important branch of sequence analysis. This dissertation presents a plane regression-based algorithm, called SFA-PR, to forecast sequence trends in real-time stream data. After gathering real-time stream data through a sliding window, SFA-PR computes the support of a specified sequence and fits a plane equation to forecast future sequence trends. Compared with other sequence trend mining algorithms, SFA-PR covers a much larger area and never omits key exceptions. A parallel forecasting algorithm named MSSF-VQ, which is based on vector quantization, is also proposed to forecast the direction of evolution in multiple sequential streams. The algorithm uses a vector space to describe streams and vector quantization to discretize the stream series. Algorithms for constructing and searching the vector probability tree are also presented. Compared with other algorithms, MSSF-VQ is more accurate, faster, and more scalable, and uses less online memory.
As network-based computer systems play increasingly vital roles in modern society, network security has become more and more important. The intrusion detection system (IDS) is one of the most critical techniques for protecting these systems. In recent years, a new approach has slowly emerged and gained impressive momentum: the intrusion tolerance system (ITS). Instead of trying to prevent every single intrusion, an ITS allows intrusions but tolerates them: the system has the means to prevent an intrusion from causing a system failure. This dissertation therefore focuses on some key issues of ITS based on sequence analysis algorithms.
Real-time models based on sequence analysis and parallel time-series mining are proposed to improve the accuracy and efficiency of traditional network intrusion detection systems. In these models, multidimensional itemsets are constructed to describe network events, and a sliding window updating algorithm is used to maintain the network data stream. Frequent pattern mining, frequent episode mining, and parallel mining algorithms are applied to implement a parallel time-series mining engine that can generate rules to distinguish intrusions from normal activities intelligently. Analysis and experiments on the DAWNING3000A indicate that this parallel time-series mining-based model provides a more accurate and efficient way to build real-time intrusion detection in ITS.
In a noisy environment, an observation may not accurately reflect the underlying intrusion, which weakens the intrusion tolerance of the system. This dissertation provides a probabilistic connection from the observation to the underlying true value and presents a novel anomaly detection algorithm for ITS based on numerical sequence analysis. It also presents a novel intrusion-tolerant intrusion detection method based on real-time sequence forecast analysis of network streams. The method uses linear regression techniques to forecast network stream sequences; these forecasts help analyze intruders' behavior and recognize undesirable intrusions. Recovery strategies for tolerating intrusions are also provided. A distributed intrusion detection method for ITS named DBSL, which is based on distributed Bayesian structure learning, is proposed. DBSL is not only useful for detecting intrusions from distributed sources but can also offer hints for intrusion recovery.
One of the most advanced research problems in intrusion tolerance systems is error detection, which has become another essential technique in system security for preventing an intrusion from causing a system failure. A parallel error detection method for ITS named PBL, which is based on distributed Bayesian learning, is proposed. This method is particularly suitable for detecting errors from distributed sources. PBL is not only useful for detecting errors in a distributed network environment but can also be used to enhance the noise tolerance of ITS.
The research on sequence analysis-based algorithms for intrusion detection and error detection not only offers a new perspective and new means for intrusion detection but also enriches data mining research.
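The forecasting idea summarized in the abstract (gather the stream through a sliding window, compute the support of a specified sequence, fit a regression to forecast its trend) can be illustrated with a minimal Python sketch. It is not the SFA-PR algorithm itself: the abstract does not give its details, so the sketch substitutes an ordinary least-squares line for the plane equation. The function names `is_subsequence`, `sliding_support`, `fit_line`, and `forecast_next`, the session-based stream model, and the sample data are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Iterable, List, Sequence, Tuple


def is_subsequence(pattern: Sequence[str], events: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `events` in order (gaps allowed)."""
    it = iter(events)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def sliding_support(
    sessions: Iterable[Sequence[str]],
    pattern: Sequence[str],
    window_size: int,
) -> List[float]:
    """Support of `pattern` in each full sliding window over the stream.

    Assumption: the stream arrives as sessions (event lists), and support is
    the fraction of sessions in the current window that contain the pattern.
    """
    window: Deque[Sequence[str]] = deque(maxlen=window_size)
    supports: List[float] = []
    for session in sessions:
        window.append(session)  # the oldest session is dropped automatically
        if len(window) == window_size:
            hits = sum(is_subsequence(pattern, s) for s in window)
            supports.append(hits / window_size)
    return supports


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares line through the points (0, ys[0]), (1, ys[1]), ...

    Returns (slope, intercept). A simple stand-in for plane regression.
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_next(ys: Sequence[float], steps: int = 1) -> float:
    """Extrapolate the fitted line `steps` windows past the last observation."""
    slope, intercept = fit_line(ys)
    return slope * (len(ys) - 1 + steps) + intercept


if __name__ == "__main__":
    # Hypothetical sample stream of sessions.
    stream = [["a", "b"], ["a", "c"], ["a", "b"], ["a", "b"], ["c"], ["a", "x", "b"]]
    supports = sliding_support(stream, ["a", "b"], window_size=2)
    print(supports)  # [0.5, 0.5, 1.0, 0.5, 0.5]
    print(forecast_next(supports))
```

The extrapolated value is not clamped, so a steep trend can yield a forecast outside the range 0 to 1; a real detector would have to decide how to treat such values, for example as a signal that the pattern frequency is changing quickly.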
Keywords/Search Tags: Data Mining, Data Stream, Sequential Patterns, Sequence Forecast, Intrusion Tolerance System, Intrusion Detection, Error Detection