
Research On Concept Drift Detection And Characterization In Process Mining Based On Heuristic Measure

Posted on: 2021-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: Q T Wang
GTID: 2428330611965689
Subject: Software engineering
Abstract/Summary:
Process mining extracts process models from event logs. In online process mining the underlying process is not stable, so the mined process model may change at any time. This phenomenon is called concept drift in process mining. Research on concept drift focuses mainly on drift detection and characterization, that is, determining whether the process has drifted and describing how the process differs before and after the drift. Most current work analyzes activity relationships qualitatively and can only confirm whether a relationship exists or not. If an activity relationship is judged incorrectly, normal variation may be mistaken for concept drift. In addition, current feature extraction methods do not take the frequency of activities and activity relationships into account, and so cannot handle noise in the business process. Existing methods for drift characterization are also insufficient: they can describe only changes in basic relationship structures.

To solve these problems, a heuristic-based concept drift detection and characterization (HCDDC) framework is proposed. The main contents of the framework are as follows:

(1) We propose a heuristic-based process path feature extraction algorithm. The algorithm uses heuristic measures to extract features at the case level. Heuristic measures analyze activity relationships quantitatively and use the magnitude of the value to represent the degree of dependence between activities. The dependency value of an activity relationship does not change abruptly, which addresses the problem of false positives in concept drift detection. Heuristic measures also take the frequency of activities and activity relationships into account, which limits the influence of noise on the detection results.

(2) We propose a heuristic-based concept drift detection method. The method obtains potential drift points for each activity pair, which raises the probability of detecting a concept drift and balances false positives against false negatives.

(3) To better characterize the nature of a drift, we define a variety of relation patterns and describe process change through changes in these patterns. Within the framework, we propose a concept drift characterization algorithm based on relation patterns. The algorithm transforms the feature values of the heuristic measure into relation patterns and produces the drift characterization results.

Three experiments are set up to analyze the validity of the HCDDC concept drift detection method. First, the influence of the sub-log size of different process paths on the detection results is analyzed. Then the detection method based on the heuristic measure is compared with those based on the runs measure and the α+ measure. When the process path window of the heuristic measure is large enough for the feature information to be extracted accurately, the detection accuracy can reach 96%, which is higher than that of the other detection methods. The detection precision comparison shows that the HCDDC drift detection method produces fewer false positives than the other methods. The noise experiment shows that the heuristic measure is more robust to noise than the α+ measure. For the concept drift characterization experiment, an event log of a live-broadcast business process is simulated, and the HCDDC drift characterization method is able to describe the changes in the key patterns.
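
The abstract does not give the formula for its heuristic measure. As a minimal sketch, the Python below assumes it resembles the dependency measure of the Heuristics Miner, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), where |a>b| counts how often activity a is directly followed by b. The function names, the two-window comparison, and the toy logs are illustrative assumptions, not the thesis's HCDDC implementation.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def dependency(counts, a, b):
    """Assumed Heuristics-Miner-style dependency of b on a, in (-1, 1)."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    if a == b:
        # Length-one loop: only the a>a frequency is available.
        return ab / (ab + 1)
    return (ab - ba) / (ab + ba + 1)


def dependency_shift(old_traces, new_traces):
    """Change in dependency value per activity pair between two sub-logs."""
    old = directly_follows(old_traces)
    new = directly_follows(new_traces)
    pairs = set(old) | set(new)
    return {p: dependency(new, *p) - dependency(old, *p) for p in pairs}


# Hypothetical sub-logs: each trace is the activity sequence of one case.
old_log = [["a", "b", "c"]] * 9 + [["a", "c", "b"]]  # one noisy case
new_log = [["a", "c", "b"]] * 10                     # order of b and c swapped

shift = dependency_shift(old_log, new_log)
for pair, delta in sorted(shift.items()):
    print(pair, round(delta, 3))
```

Because the measure is a ratio of frequencies, the single noisy case in `old_log` only lowers the dependency of c on b slightly, while the reordering in `new_log` flips its sign. A detector in this spirit would flag activity pairs whose shift exceeds some threshold; how HCDDC chooses windows and thresholds is not stated in the abstract.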
Keywords/Search Tags: process mining, concept drift, heuristic measure, drift detection, drift characterization