Font Size: a A A

Research On Intrusion Detection System Orientated String Matching Optimization

Posted on:2015-08-03Degree:DoctorType:Dissertation
Country:ChinaCandidate:T L YangFull Text:PDF
GTID:1228330422492409Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Internet has become so popular that it can not be taken away from our daily life. With the development of internet, technique of network crime also becomes popular and diversified. As a result nowadays computers have more chances to become victim nodes in network because of the network attacks hidden behind some place. Hackers might also employ some social means to intrude personal computer to get something they are interested. Even some victims are used to cooperate with other victim computers so that they can make more serious attacks. So it is vital to develop a strong security architecture to guard against potential and to stop ongoing attack. Intrusion detection system (IDS) is a necessary part of a security architecture. In a IDS the software reads data from network and classifies the data as the harmless or the harmful according to some rules or characters. After the classification by IDS, further analysis can be made upon the harmful data so that we can know which attacks are related to each other and where these attacks come from from. To make IDS run effectively effective string matching algorithm is needed. The problem is that the diversified attacks enlarge the patterns set (which is used to recognize harmful network data) of IDS so that the enlarged size impacts the performance of IDS.In this paper we mainly focus on the technique to support large scale patterns set matching and the technique for high performance matching. The points of this paper will be detailed as following s:(1) A computer system will fail to construct the matching automaton due to de-manding too much memory when the amount of patterns is too large. To make matching automaton from large patterns amount can be loaded into memory successfully, a hybrid matching structure is studied. In this structure patterns are classified according to their lengths so that the original matching automaton can be reorganized compactly. Text can be matched by two different matching technique using different matching automatons. To make the hybrid matching structure run as effectively as possible, two kinds of tex-t filtration are employed:state-symbol reversed mapping (SLSPM) and two layer AVL filtration (BCHDFA). In the state-symbol reversed mapping, by first symbol-reading and then state-comparison method, looking for pattern block is replaced by looking for state number. While in the two layer AVL filtration, two specific positions in the substring of a text are selected and two substrings with a length of2from the two positions are compared with the corresponding substrings of some pattern blocks to produce possible filtration. Both of these two kinds of filtration techniques reduce the searching time.(2)To make long pattern, such as URL, matching more effectively, two SSE instruc-tions based algorithms SSEHash and RTRIE are proposed. Both SSEHash and RTRIE are only suitable when the pattern length is not shorter than16. In SSEHash,128bits data (168-bits characters) can be distributed into units by pseudo hash function composed of two rounds of SSE instruction PHADDW and24bits modular arithmetic. With the help of SSE instruction, our pseudo hash function runs faster than real hash function, so that SSEHash can be used to filter out unnecessary text block more efficiently. As a result of this kind of effective filtration, SSEHash supplies better performance when the short-est length is longer. In Wu-Manber, the text pointer moving distance often gets shorter when the patterns amount increases. To solve this problem the algorithm RTRIE employ-ing reversed trie tree is designed. All possible suffixes of the prefixes with a length of16for all patterns are selected, and one specific bit for each character in the suffixes are selected to make fingerprints. The reversed trie tree is composed of these fingerprints, and safe text pointer moving distance is computed for each fingerprint. RTRIE performs better than Wu-Manber because:The degeneration of the moving distance in RTRIE can be restrained better when the patterns amount increases than that in Wu-Manber because that the moving distance in RTRIE is computed from more than two or three characters compared with that computed in Wu-Manber.(3)To improve the performance of the expression matching KPGEM and MaskVeri are proposed to delete unnecessary expressions and reduce expression matching time. Be-fore deleting unnecessary expressions all the possibilities regarding expression inclusion are discussed. In an expression, the relationship of successive and preceding is defined by both the short keys appearing position in the expression and the short keys length to prevent losing possible combination of key (or pattern) as another expression. A graph of keys path for one expression is generated from all the short keys belonging to the same expression according to the relationship of successive and preceding. If any expression can be found in some sub-path from the keys path graph of the original expression, the original expression represented by the graph can be deleted from the expressions set safe-ly.(4)String matching algorithm is the vital technique used in the network data inspec-tion of the IDS. Based on the alerts from the element of network data inspection complex attack scenarios can be extracted. At the end of the thesis a simple IDS based on the proposed string matching technique in the former part of the thesis is designed. In or-der to decrease the amount of the alerts waiting for future analysis (sequence analysis), a trick named semantics reserving was proposed. This trick can reserve the whole useful alert sequences when it deletes the repeated sequences in the alerts set. In our opinion the sequence analysis based technique gives a macroscopic view of the attack alerts and the capability transition mode alerts correlating shows a more detailed relationship be-tween alerts in microcosmic view. After data fusion by D-S based on sequence analysis and capability transition mode alerts correlating, the result was fed into a function named belief-energy. Comparing belief-energy, a sequence which has the largest belief-energy can be automatically selected as the most attack liable one. Our intrusion detection system can both hold large amount of patterns and extract attack scenarios accurately.
Keywords/Search Tags:Net security, Intrusion detection system, String matching algorithm, Opti-mization with SSE instruction, Attack alerts correlating
PDF Full Text Request
Related items