The Research On Classification Algorithms Over Data Stream

Posted on: 2010-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: F X Li
Full Text: PDF
GTID: 2178360275458667
Subject: Computer application technology
Abstract/Summary:
The rapid progress of communications, computers and networks has brought people into the information society. The volume of data is growing very quickly. This brings convenience to daily life, but it can also leave people overwhelmed. Data mining is a technique that helps people find what they need: it extracts useful information from massive volumes of data. Recently a new data processing model, known as the data stream, has arisen. Example applications include financial tickers, network traffic monitoring, web and transaction log analysis, and sensor networks.

This paper proposes a new classification method called EVFDT and an ensemble classifier system. We designed experiments to validate the precision and time efficiency of EVFDT and of the ensemble classifiers. The main work is as follows:

i. We develop a method called Uneven Interval Pruning that gives the algorithm the ability to split numerical attributes. This method performs well.
ii. We use Naive Bayes classifiers at the inner nodes and leaf nodes. This reduces the sample space when training the decision tree.
iii. We propose an ensemble of classifiers and a concept-drift detection method to handle concept-drifting data streams.
iv. We propose a weighting method based on precision, following the principle of locality, and a pruning method based on the resulting weights.
v. We design a series of experiments to validate the performance of the EVFDT algorithm and the ensemble classifiers. The results show that EVFDT has high precision and time efficiency, and that the ensemble classifiers perform well on concept-drifting data streams.
Keywords/Search Tags: Data Mining, Data Stream, Classification, Uneven Interval Pruning, Naive Bayes Classifiers
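
The abstract mentions an ensemble whose members are weighted by precision under the principle of locality and pruned by weight, but it does not give the details. The following is only a minimal sketch of one way such a scheme could look, not the thesis's actual method. It assumes that "locality" means accuracy over a sliding window of the most recent labelled samples, and that pruning means dropping the lowest-weight member when the ensemble is full. The class name WindowWeightedEnsemble, its parameters (window, max_members) and the neutral starting weight of 0.5 are all hypothetical.

```python
from collections import Counter, deque


class WindowWeightedEnsemble:
    """Hypothetical sketch: weighted voting with weights taken from recent accuracy."""

    def __init__(self, window=100, max_members=5):
        self.window = window            # number of recent samples used for weighting (assumption)
        self.max_members = max_members  # ensemble size limit (assumption)
        self.members = []               # list of (predict_fn, deque of 0/1 hits)

    @staticmethod
    def weight(hits):
        # Accuracy on the recent window; a member with no history
        # gets a neutral weight of 0.5 (assumption).
        return sum(hits) / len(hits) if hits else 0.5

    def add(self, predict_fn):
        # Prune by weight: when the ensemble is full, drop the weakest member first.
        if len(self.members) >= self.max_members:
            weakest = min(self.members, key=lambda m: self.weight(m[1]))
            self.members.remove(weakest)
        self.members.append((predict_fn, deque(maxlen=self.window)))

    def update(self, x, y):
        # Record whether each member classified the newly labelled sample correctly.
        for predict_fn, hits in self.members:
            hits.append(1 if predict_fn(x) == y else 0)

    def predict(self, x):
        # Weighted vote: each member adds its current weight to the label it predicts.
        votes = Counter()
        for predict_fn, hits in self.members:
            votes[predict_fn(x)] += self.weight(hits)
        return votes.most_common(1)[0][0]
```

In this sketch a member trained on an outdated concept loses influence as its recent accuracy falls, and it is the first to be removed when a new member is added to a full ensemble.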