Research On Dynamic Data Stream Classification Based On Bayesian Network

Posted on:2020-09-27

Degree:Master

Type:Thesis

Country:China

Candidate:H M Fan

GTID:2428330596979677

Subject:Computer application technology

Abstract/Summary:

With the arrival of the big data era, the volume of online data has increased dramatically, and mining massive data streams in real time has become a major challenge in machine learning. Online learning methods process large-scale data in real time by updating the model incrementally and handling the data item by item, and they have received extensive attention from researchers. As an online learning method, Naive Bayes is simple and efficient, has a solid theoretical foundation, and is used for data stream classification. However, concept drift in the data stream seriously degrades its classification performance, and its assumption of attribute independence is usually not met in real-world applications. To address these problems, this thesis studies three improvements to the Naive Bayes algorithm.

(1) To address the excessively high dimensionality of the feature space in classification and the fact that the attribute independence assumption of Naive Bayes often does not hold, this thesis proposes an information theory-based classification framework for attribute selection. By analyzing the relationship between Jeffreys divergence and the type I and type II errors of the Bayesian classifier, and in view of the limitations of Jeffreys divergence for multivariate distributions, the multi-Jeffreys-Hypothesis (MJH) is introduced to measure differences between multivariate distributions, and a selective Naive Bayes classification algorithm based on MJH is proposed. Experimental results show that the algorithm achieves good classification performance and convergence.

(2) The Naive Bayes classifier has no mechanism for detecting and handling concept drift, so it cannot classify streaming data under non-stationary conditions. This thesis proposes a weighted Naive Bayes algorithm based on a forgetting mechanism. Each instance is weighted by the forgetting mechanism, and its weight decays gradually over time, so that the Naive Bayes classifier adapts automatically and quickly to changes in the data and thereby handles concept drift. Experimental results show the effectiveness of the algorithm.

(3) For data streams with concept drift, based on the assumption that historical knowledge and current knowledge are related, and drawing on an analysis of the advantages of ensemble learning, this thesis proposes an ensemble learning algorithm based on knowledge transfer. Knowledge transfer extracts the useful knowledge in the historical model and removes the knowledge that is inconsistent with the latest data distribution, yielding a new historical model. The transferred historical model is then merged, by weighting, with the model derived from the latest data. Experimental results on simulated and real data show that the ensemble learning algorithm based on knowledge transfer makes full use of the advantages of ensemble learning and effectively handles concept drift in data stream classification.
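The forgetting idea in (2) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual weighting formula, so the code below assumes exponential decay of instance weights, categorical attributes, and Laplace smoothing; the class name `ForgettingNaiveBayes` and the parameters `decay` and `alpha` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


class ForgettingNaiveBayes:
    """Categorical Naive Bayes with exponentially decayed counts.

    Assumption: an instance seen k steps ago has weight decay**k.
    This is one possible forgetting scheme, not the thesis's own.
    """

    def __init__(self, decay=0.9, alpha=1.0):
        self.decay = decay    # forgetting factor in (0, 1]; 1.0 disables forgetting
        self.alpha = alpha    # Laplace smoothing constant
        self.class_weight = defaultdict(float)    # label -> decayed count
        self.feature_weight = defaultdict(float)  # (attr index, value, label) -> decayed count
        self.values = defaultdict(set)            # attr index -> values seen so far

    def learn_one(self, x, y):
        # Fade all stored evidence, then add the new instance with weight 1.
        for key in self.class_weight:
            self.class_weight[key] *= self.decay
        for key in self.feature_weight:
            self.feature_weight[key] *= self.decay
        self.class_weight[y] += 1.0
        for i, v in enumerate(x):
            self.feature_weight[(i, v, y)] += 1.0
            self.values[i].add(v)

    def predict_one(self, x):
        # Returns None if no instance has been learned yet.
        total = sum(self.class_weight.values())
        n_classes = len(self.class_weight)
        best_label, best_score = None, -math.inf
        for label, cw in self.class_weight.items():
            # Smoothed log prior plus smoothed log likelihood per attribute.
            score = math.log((cw + self.alpha) / (total + self.alpha * n_classes))
            for i, v in enumerate(x):
                n_values = max(len(self.values[i]), 1)
                num = self.feature_weight.get((i, v, label), 0.0) + self.alpha
                score += math.log(num / (cw + self.alpha * n_values))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

With `decay=1.0` the model reduces to ordinary incremental Naive Bayes, so evidence from an outdated concept keeps its full weight. With `decay` below 1, that evidence fades and predictions follow the most recent data. Rescaling every count on each update is simple but costs time proportional to the number of stored counts; a production version would more likely keep a global scale factor.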

Keywords/Search Tags:

Bayesian Network, Attribute Selection, Concept Drift, Forgetting Mechanism, Knowledge Transfer

Related items

1	Condition-induced Concept Drift Detection And Optimizing Selection Of Attribute Reducts
2	Research On Classification Algorithm Of Concept Drift Data Stream Based On Online Transfer Learning
3	Research On Bayesian Network Based Approach For Data Mining
4	The Research Of One-dependence Bayesian Model Based On Attribute Selection
5	Research On Concept Drift Type Detection And Accelerated Convergence Of Streaming Data
6	Research On Detection And Adaptive For Mixed Types Concept Drift
7	Research On Competence Model-Based Adaptive Learning Techniques For Handling Concept Drift
8	The Construction And Application Of Concept-Attribute Knowledge Network Based On Digital Books
9	The Study Of Selective Adaptive Ensemble Learning Method For Concept Drift Problem
10	Research And Application Of Naive Bayesian Classification Based On Attribute Selection