
Research And Application On Data-stream Outlier Data Mining

Posted on: 2008-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
Full Text: PDF
GTID: 2178360242960611
Subject: Management Science and Engineering
Abstract/Summary:
Outlier data deserve serious attention, because they can reveal genuine and unexpected knowledge. A data stream is an unbounded, dynamic sequence of data that arrives in order. Outlier mining in data streams is a new data mining task with broad applications in daily life. Data-stream mining is currently a hot topic in the database, machine learning, and statistics communities, and a useful tool in many research fields. As the data-stream model becomes widely used for personal and commercial information, existing application software needs to analyze and process this rapidly changing data. However, the limitations of existing data-stream systems and the one-pass nature of data streams make it hard to mine useful information from huge data streams effectively, or to process them further. Many researchers have pointed out the shortcomings of traditional data mining algorithms when applied to data streams; these shortcomings have in turn driven research on improving existing data mining algorithms and on creating new data-stream mining algorithms.

This thesis is divided into six chapters. The first chapter, Foreword, briefly introduces the basic concepts, theories, and characteristics of data mining technology. The second chapter, Summary of Outlier Data Mining, presents outlier data mining and the outlier mining methods in common use. The third chapter, Data-stream Clustering Analysis, describes the main data-stream clustering methods and the close relationship between data-stream clustering and data-stream outlier mining. The fourth chapter presents a distributed data-stream outlier mining algorithm based on reverse k nearest neighbors (RkNN). It builds on the framework of the "CluStream" algorithm, designs a CluStream-based algorithm for data-stream outlier mining, and extends that algorithm to the distributed data-stream environment. The chapter closes with the experimental process and results. The fifth chapter designs a real-time forecast system for agricultural weather disasters, applying the earlier chapters' research on data-stream outlier mining to the agricultural weather domain, and analyzes the system structure and workflow in detail. The last chapter summarizes the thesis and outlines prospects for future study.
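
The abstract does not spell out the fourth chapter's algorithm, so the following is only a minimal, generic sketch of the RkNN idea on a single stream, not the thesis's CluStream-based or distributed method. It assumes a fixed-size sliding window and Euclidean distance, and flags a newly arrived point when fewer than `min_rknn` other points in the window count it among their k nearest neighbors. The names `stream_outliers`, `rknn_counts`, `window_size`, and `min_rknn` are hypothetical.

```python
from collections import deque
import math


def knn_indices(points, i, k):
    """Return the indices of the k points closest to points[i], excluding i."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return [j for _, j in dists[:k]]


def rknn_counts(points, k):
    """For each point, count how many other points have it in their kNN set."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            counts[j] += 1
    return counts


def stream_outliers(stream, window_size, k, min_rknn):
    """Yield each arriving point whose RkNN count in the window is below min_rknn.

    Assumption: a plain sliding window stands in for the stream summary;
    the thesis itself builds on CluStream, which is not reproduced here.
    """
    window = deque(maxlen=window_size)  # oldest points drop out automatically
    for point in stream:
        window.append(point)
        if len(window) <= k:
            continue  # not enough neighbors yet to judge the new point
        counts = rknn_counts(list(window), k)
        if counts[-1] < min_rknn:  # the newest point is last in the window
            yield point


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(list(stream_outliers(data, window_size=5, k=2, min_rknn=1)))
```

Recomputing all neighbor sets on every arrival costs quadratic time in the window size, which is acceptable for illustration but is exactly the kind of cost that summary structures such as micro-clusters are meant to avoid on real streams.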
Keywords/Search Tags:Data-stream, Outlier Data Mining, Clustering, Distributed Data Mining