Detecting partial drifts using a rule induction framework

Posted on: 2011-10-14
Degree: M.Sc
Type: Thesis
University: York University (Canada)
Candidate: Sotoudeh-Hosseini, Damon
GTID: 2448390002460552
Subject: Computer Science

Abstract/Summary:

The major challenge in mining data streams is concept drift, the tendency of the underlying data generation process to change over time. When the concept drifts, a classification model learned from the earlier part of the stream loses accuracy on new instances. It is thus important to find efficient methods of updating the classification model in order to maintain high performance.

Stream classification algorithms commonly process the most recent instances in order to adjust the classification model. We argue that old instances are not necessarily the least relevant to the emerging concept, and propose a method that aims to provide the classification algorithm with the set of instances most relevant to the emerging concept, which are not necessarily the most recent ones. Learning a new concept from a larger set of instances reduces the variance of the data distribution and allows for a more accurate, stable, and robust classification model.

Our experiments show that the proposed approach can successfully detect a variety of concept drifts. The framework can also offer higher classification accuracy than other approaches on a variety of synthetic and real data sets. Furthermore, the proposed approach processes the stream very efficiently, in terms of both execution time and memory consumption.

In this thesis, we propose a general rule learning framework that can efficiently handle concept-drifting data streams and maintain a highly accurate classification model.
The main idea is to focus on partial drifts by allowing individual rules to monitor the stream and detect whether a drift has occurred in the regions they cover. For each rule affected by a drift, a rule quality measure decides whether the rule has become inconsistent with the emerging concept. The model is then updated to remove the inconsistent rules and retain only rules that are consistent with the newly arrived concept.

Keywords: Concept, Rule, Drift, Data, Classification model, Stream
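
The per-rule monitoring described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the `Rule` class, the sliding window size, the use of recent precision as the rule quality measure, and the `min_quality` and `min_support` thresholds are all assumptions made for illustration. The sketch only shows the general shape of the idea, where each rule tracks the instances falling in its own region and is removed once it no longer agrees with them.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List

Instance = Dict[str, float]


@dataclass
class Rule:
    """Hypothetical rule: a covered region plus the class predicted inside it."""
    name: str
    covers: Callable[[Instance], bool]
    label: int
    window: int = 50  # assumed size of the per-rule sliding window
    hits: Deque[bool] = field(default_factory=deque)

    def observe(self, x: Instance, y: int) -> None:
        # A rule only monitors instances that fall inside its own region.
        if self.covers(x):
            self.hits.append(y == self.label)
            if len(self.hits) > self.window:
                self.hits.popleft()

    def quality(self) -> float:
        # Assumed quality measure: precision over the recent covered instances.
        return sum(self.hits) / len(self.hits) if self.hits else 1.0


def prune_inconsistent(rules: List[Rule], min_quality: float = 0.6,
                       min_support: int = 20) -> List[Rule]:
    # Keep rules with too little evidence, or with acceptable recent quality.
    return [r for r in rules
            if len(r.hits) < min_support or r.quality() >= min_quality]


# Synthetic stream with a partial drift: only the region a >= 0.5 changes.
rules = [
    Rule("low", lambda x: x["a"] < 0.5, label=0),
    Rule("high", lambda x: x["a"] >= 0.5, label=1),
]
for i in range(200):
    x = {"a": (i % 10) / 10}
    drifted = i >= 100
    if x["a"] < 0.5:
        y = 0                      # unaffected region keeps its concept
    else:
        y = 0 if drifted else 1    # affected region flips its class
    for r in rules:
        r.observe(x, y)

kept = prune_inconsistent(rules)
print([r.name for r in kept])
```

In this toy stream only the rule covering the drifted region is dropped, while the rule covering the unaffected region is kept, which is the behaviour the phrase "partial drift" suggests. A fuller system would also need to learn replacement rules for the uncovered region, which the sketch omits.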