String Matching Algorithm Design And Implementation, Based On The Hierarchical Classification Of Web Content Monitoring System

Posted on: 2005-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zhang
Full Text: PDF
GTID: 2208360125455645
Subject: Computer software and theory
Abstract/Summary:
Network Content Monitor has become increasingly important because illegal information involving subversion, violence, obscenity and the like is found on the network. In this paper, we analyse the current development of Network Content Monitor and introduce a hierarchical classification method for it. The method first filters network information with a keyword filter, and then applies semantic analysis only to the information that contains the keyword(s) given by the user. This eases the load on semantic analysis, improves the efficiency of the system, and adapts well to the changeable nature of illegal information.

The keyword filter is realized by a string matching algorithm and is the performance bottleneck of Network Content Monitor. Therefore, this paper analyses the existing string matching algorithms, covering both single-pattern and multi-pattern string matching. We then design and implement two algorithms suited to the Chinese character set and to Network Content Monitor: IQS, a single-pattern string matching algorithm based on the QS algorithm, and IWM, a multi-pattern string matching algorithm based on the Wu-Manber algorithm. IQS is intended for searching for a keyword among files saved on local hard disks, and IWM for filtering network information. We run experiments on IQS and IWM with Chinese and English text and with different pattern lengths and numbers of patterns. The experimental results indicate that the two algorithms achieve better performance in both speed and number of matching attempts.

Finally, we integrate IQS and IWM into the Network Content Monitor system and run experiments on the system as a whole. The results show that high-speed string matching is achieved and that the performance targets of the system are met.
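For readers unfamiliar with the QS (Quick Search) algorithm that IQS builds on, the following is a minimal sketch of the baseline algorithm only. It is not the thesis's IQS, whose modifications the abstract does not describe. The function names `build_shift_table` and `quick_search` are illustrative, and matching on UTF-8 bytes is an assumption made here so that Chinese and English text can share one table; the thesis may handle the Chinese character set differently.

```python
def build_shift_table(pattern: bytes) -> dict[int, int]:
    """Map each byte in the pattern to its Quick Search shift.

    The shift for a byte is the pattern length minus the index of its
    last occurrence; later occurrences overwrite earlier ones.
    """
    m = len(pattern)
    return {byte: m - i for i, byte in enumerate(pattern)}


def quick_search(text: bytes, pattern: bytes) -> list[int]:
    """Return the byte offsets of all occurrences of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    shift = build_shift_table(pattern)
    matches = []
    pos = 0
    while pos <= n - m:
        # Compare the current window against the pattern.
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        nxt = pos + m  # index of the byte just past the window
        if nxt >= n:
            break
        # Bytes absent from the pattern allow a jump past that byte.
        pos += shift.get(text[nxt], m + 1)
    return matches


# Assumption: UTF-8 encoding, where each of these Chinese characters
# occupies three bytes, so offsets are byte offsets, not character offsets.
text = "网络内容监控".encode("utf-8")
keyword = "内容".encode("utf-8")
print(quick_search(text, keyword))  # [6]
print(quick_search(b"abracadabra", b"abra"))  # [0, 7]
```

The shift is decided by the byte immediately after the current window, which is what distinguishes Quick Search from Boyer-Moore-style shifts based on the window's last byte. Byte-level matching is safe for UTF-8 because the encoding is self-synchronizing; with a double-byte encoding such as GBK, a match could begin in the middle of a character and would need an extra alignment check.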
Keywords/Search Tags: Network Content Monitor, string matching, single-pattern, multi-pattern