
Research On Concept Drift Detection Algorithms For Data Streams

Posted on: 2017-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: J H Hao
GTID: 2348330512450936
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology, huge amounts of data are created in network monitoring, telecommunications data management, financial applications, etc. Data streams, which are characterized by high speed, continuity, volatility and openness, not only bring challenges to data storage but also make it difficult to extract knowledge. In a data stream, the data change dynamically over time, so the hidden concepts can change unpredictably; this is called concept drift. Concept drift arises in applications such as spam detection, financial fraud detection, weather forecasting and modelling customer preferences in online shopping, so its study is of practical significance.

For concept drift detection, most existing works build on methods such as rule-based systems, decision trees, Naive Bayes and support vector machines. These methods measure the probability distribution and the classification accuracy to detect drift. There is still room for improvement in concept description and classification accuracy. Accordingly, our work focuses on concept drift detection, and the main contributions are as follows:

(1) Concept drift detection problems in data streams are first summarized. The related works are reviewed and analyzed, and the breakthrough points and goals of this work are proposed.

(2) The performance of drift detection depends on the concept description. Existing algorithms use a sliding window mechanism to detect concept drift. The data are ordered between windows but treated as unordered within a single window, so some temporal information is lost. To address this, a concept drift detection algorithm based on sequence alignment (named CDD BSA) is proposed in this thesis. The algorithm uses the sliding window mechanism and an improved Needleman-Wunsch algorithm to measure the similarity between windows. Our extensive experiments show that CDD BSA improves on several state-of-the-art algorithms in similarity measurement and robustness.

(3) Concept drift detection algorithms based on a single sliding window are greatly influenced by the window size. To overcome this weakness, a concept drift detection algorithm based on dynamic windows (named CDD BDW) is proposed. The Naive Bayes approach is used for classification, and the Boosting algorithm is used to integrate the classifiers into an ensemble. The window size is adjusted dynamically as the data stream drifts, and the ensemble classifier is updated dynamically to maintain classification accuracy. Our experiments demonstrate that the algorithm improves on several state-of-the-art algorithms in classification accuracy and concept drift detection.
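
The idea in contribution (2), comparing consecutive sliding windows by sequence alignment so that the order of items inside a window is taken into account, can be illustrated with a minimal sketch. It uses the standard Needleman-Wunsch global alignment score, not the improved variant developed in the thesis, whose details the abstract does not give. The function names, the scoring values (match 1, mismatch -1, gap -1), the normalisation by window length, the non-overlapping windows and the threshold are all assumptions made for illustration only.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Standard Needleman-Wunsch global alignment score of sequences a and b.

    The scoring values are illustrative assumptions, not taken from the thesis.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] with an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def window_similarity(w1, w2):
    """Alignment score divided by the longer window length (assumed normalisation).

    With match=1, identical windows score exactly 1.0.
    """
    if not w1 and not w2:
        return 1.0
    return nw_score(w1, w2) / max(len(w1), len(w2))


def detect_drift(stream, window_size, threshold):
    """Return start indices of windows that differ sharply from their predecessor.

    The stream is cut into non-overlapping windows (an assumption); a window is
    flagged when its similarity to the previous window falls below threshold.
    """
    windows = [
        stream[i:i + window_size]
        for i in range(0, len(stream) - window_size + 1, window_size)
    ]
    drifts = []
    for k in range(1, len(windows)):
        if window_similarity(windows[k - 1], windows[k]) < threshold:
            drifts.append(k * window_size)
    return drifts


if __name__ == "__main__":
    # Hypothetical symbolic stream: the pattern changes at index 8.
    stream = list("ababababccddccdd")
    print(detect_drift(stream, window_size=4, threshold=0.5))
```

Here the stream items are assumed to be discrete symbols, such as class labels or discretised feature values. Because the alignment is order-sensitive, two windows containing the same items in a different order receive a lower similarity than identical windows, which is the temporal information that an unordered comparison would lose.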
Keywords/Search Tags: Data streams, Concept drift, Sequence alignment, Dynamic window