
Evaluation Model Design And Implementation Of Data Stream Classification Technology

Posted on: 2017-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: X S Wan
Full Text: PDF
GTID: 2308330485458150
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of Internet data, many kinds of data are now generated as streams. Static data mining techniques can no longer meet the needs of real-world problems, so a variety of data stream mining techniques have emerged, and evaluation models for these techniques have become an active research topic. This paper first introduces the data stream mining platform MOA (Massive Online Analysis), which implements data stream mining techniques including classification algorithms, clustering algorithms, and concept drift detection methods. Starting from the classification problem, the paper then introduces evaluation models for data stream classification, focusing on the various evaluation strategies and evaluation metrics. On this basis, two evaluation models are proposed: the balanced accuracy evaluation model (BalancedAccuracy) and the AUC (Area Under the ROC Curve) evaluation model (AUC2Stream). Finally, experiments are carried out on the MOA platform using data stream generators and real-world data streams. The two evaluation models are used to evaluate different stream classification algorithms and to compute the corresponding metrics. Statistical and visualization tools are then used to analyze and compare the algorithms and to examine the underlying reasons for their advantages and disadvantages.

Experimental results show that, because the BalancedAccuracy evaluation model takes the relationship between accuracy and resource consumption into account, it gives a more balanced accuracy-based assessment of data stream classification algorithms. When computing resources (computing time and memory) are constrained but high accuracy is still required, evaluating data stream classification algorithms with BalancedAccuracy is more accurate. In a data stream environment, the AUC2Stream evaluation model can handle multi-class problems. When such a problem also calls for a higher recall and a lower false positive rate, evaluating data stream classification algorithms with AUC2Stream is more accurate. The evaluation models proposed in this paper provide a more adequate theoretical basis for guiding the use of data stream classification algorithms.
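
To make the idea of a multi-class AUC computed over a stream more concrete, the following is a minimal sketch and not the thesis's AUC2Stream model, whose exact definition the abstract does not give. It assumes a sliding window of recent predictions and a one-vs-rest macro average. The names `binary_auc`, `WindowedMulticlassAUC`, and `window_size` are hypothetical and chosen only for illustration.

```python
from collections import deque


def binary_auc(scores, labels):
    """AUC as the probability that a random positive instance scores
    higher than a random negative one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


class WindowedMulticlassAUC:
    """One-vs-rest, macro-averaged AUC over the most recent instances.

    Assumption for illustration: a fixed-size sliding window, so old
    instances are forgotten as new ones arrive from the stream.
    """

    def __init__(self, classes, window_size=1000):
        self.classes = list(classes)
        self.window = deque(maxlen=window_size)

    def update(self, class_scores, true_label):
        # class_scores: dict mapping each class to the classifier's score
        self.window.append((class_scores, true_label))

    def value(self):
        aucs = []
        for c in self.classes:
            scores = [s.get(c, 0.0) for s, _ in self.window]
            labels = [y == c for _, y in self.window]
            auc = binary_auc(scores, labels)
            if auc is not None:
                aucs.append(auc)
        # Average only over classes whose AUC is defined in the window
        return sum(aucs) / len(aucs) if aucs else None
```

In use, `update` would be called once per arriving instance with the classifier's per-class scores and the true label, and `value` would be read whenever a metric is needed. The pairwise comparison is quadratic in the window size, which keeps the sketch short; a rank-based computation would be preferable for large windows.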
Keywords/Search Tags:data stream mining, classification technologies, evaluation models, evaluation metrics