Research On Data Classification Based On Rough Set And Neural Network

Posted on: 2009-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Zhang
Full Text: PDF
GTID: 2178360242467498
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology and database technology, people can easily access and store massive amounts of data. However, it is difficult to extract valuable information from such large volumes of data. As an important data analysis technology, data classification can be used to extract models that describe important data classes and to predict future data trends, helping people escape the predicament of being "data rich but information poor". Intrusion detection and text classification are both, in essence, data classification problems. Intrusion detection determines which type a behavior belongs to according to the characteristics of host or network data, while text classification assigns documents to classes according to their features.

Rough set theory is a mathematical tool that handles imprecise, incomplete and uncertain data well. It can effectively eliminate redundant information through attribute reduction and extract classification rules. Neural networks, used as classifiers, offer high accuracy and robustness. Both have therefore been widely used in data classification in recent years. However, in tasks such as intrusion detection and text classification, existing methods cope poorly with massive, high-dimensional data. Classification methods based on rough set theory have poor fault tolerance and weak generalization ability, while neural network classifiers suffer from defects such as complex network structures and long training times. How to combine rough set theory and neural networks effectively for intrusion detection and text classification is the main research topic of this thesis.

To address the fact that intrusion detection data is high-dimensional, redundant and noisy, a novel hierarchical intrusion detection model based on rough set theory and neural networks is proposed. The model uses rough sets in preprocessing to reduce the data, and constructs a hierarchical classifier from multiple neural networks to resolve the dilemma between stability and plasticity.

Two key technologies of text classification are discussed: weight calculation and feature extraction. Feature filtering based on the distribution discrepancy of the characteristic words of each class is used for text preprocessing. The frequency of documents belonging to the same class is introduced to modify the TF-IDF formula. To further select the features that contribute most to classification, feature selection based on the variable precision rough set (VPRS) model is proposed and implemented in SQL, and an RBF neural network is used as the text classifier. Experimental results show that the combination of rough set theory and neural networks can be applied effectively to data classification.
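
The abstract does not give the reduction algorithm itself, so the following is only a minimal sketch of the general idea behind rough set attribute reduction, not the thesis's method. It assumes a small decision table with discrete attribute values, measures how well a set of attributes determines the class through the standard dependency degree (the share of objects in the positive region), and picks attributes with a simple greedy heuristic that is swapped in here for illustration. The `beta` parameter is an assumed way of showing the VPRS relaxation: an equivalence class counts as positive when its majority class share is at least `beta`, and `beta=1.0` gives the classical rough set case. The function names, the toy data and the use of Python rather than SQL are all hypothetical.

```python
from collections import Counter, defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs, beta=1.0):
    """Share of rows lying in blocks whose majority class share is >= beta.

    beta=1.0 is the classical dependency degree; beta<1.0 tolerates some
    misclassification inside a block, in the spirit of VPRS.
    """
    positive = 0
    for block in partition(rows, attrs):
        majority = Counter(labels[i] for i in block).most_common(1)[0][1]
        if majority / len(block) >= beta:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, labels, beta=1.0):
    """Greedily add attributes until the full-table dependency is reached.

    This is a heuristic: it returns a sufficient attribute subset, which is
    not guaranteed to be a minimal reduct.
    """
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs, beta)
    chosen = []
    while dependency(rows, labels, chosen, beta) < target:
        remaining = [a for a in all_attrs if a not in chosen]
        best = max(remaining,
                   key=lambda a: dependency(rows, labels, chosen + [a], beta))
        chosen.append(best)
    return sorted(chosen)


# Toy decision table (hypothetical): three discrete condition attributes.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
    (0, 0, 1),
    (1, 1, 0),
]
labels = ["normal", "normal", "attack", "attack", "normal", "attack"]

# One extra noisy record that contradicts attribute 0.
noisy_rows = rows + [(0, 0, 1)]
noisy_labels = labels + ["attack"]

if __name__ == "__main__":
    print(greedy_reduct(rows, labels))
    print(dependency(noisy_rows, noisy_labels, [0]))
    print(dependency(noisy_rows, noisy_labels, [0], beta=0.7))
```

On the toy table, attribute 0 alone already determines the class, so the other two attributes are redundant. With the noisy record added, the classical dependency of the class on attribute 0 drops, while a relaxed `beta` still treats the mostly consistent block as positive; this tolerance to noise is the usual motivation for a variable precision model.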
Keywords/Search Tags:Rough Set, Neural Network, Intrusion Detection, Text Classification