
Sudden Drift Detection Of Behavior Process Based On Software Execution Event Stream

Posted on: 2021-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Yuan
Full Text: PDF
GTID: 2518306230478304
Subject: Software engineering
Abstract/Summary:
Mining and improving software processes is an important way to ensure the quality of software products and to keep them in operation. In process mining, the log, the model and the process are the three key entities. A drift detection algorithm detects the evolution of the model by detecting the evolution of the log data, so that the process can then be improved. In terms of data, on the one hand, with the development of open source platforms such as GitHub, a large amount of log data generated by software projects is available for study and mining; on the other hand, the extensive recording of software execution logs makes it possible to discover and dynamically analyze the runtime behavior of software. This thesis proposes a drift detection method based on changes in activity distance, which addresses the problem of sudden drift detection in software process event streams. The method consists of the following steps.

(1) Extraction of activities: Software development process data has no activity attribute, so each submission (commit) message must first be mapped to an activity. Existing mapping methods rely on predefined mapping rules, and the number of mapped activities is small and fixed. Therefore, a method of extracting activities with the LDA topic model is proposed. The number of activities can be chosen freely and the mapping rules are derived by the algorithm rather than defined by hand, which is more reasonable.

(2) Window selection: How to choose the size of the sliding window and how many steps to move it at a time has been widely discussed in drift detection. Currently, the better solutions are based on trace minimization and trace independence, and they assume multiple instances. Based on the activity relation, this thesis determines the window size by detecting the growth point of the lowest-frequency activity relation. The sliding window only needs to ensure that, between adjacent windows, the tail activity of one window is the first activity of the next window, so that no relation is lost.

(3) The proposed drift detection framework: Once the activity sequence and the window size are available, drift detection is carried out on the ever-growing single-instance sequence, with the activity as the object of study. The framework is divided into three steps: extraction of the relation matrix, transformation of the relation matrix into a distance matrix, and detection of changes in the distance distribution. The directly-follows relation, the Jaccard distance and the KL divergence are used for these three steps, respectively. The detected change in the distance distribution is the result of drift detection.

The method is evaluated on several groups of simulated data and real data, and the experimental results show that it can effectively detect the drift points of a software process event stream. The contribution of this thesis is twofold. First, existing drift detection algorithms are not applicable to the log form of the software process event stream, and the proposed method fills this gap. Second, for window selection, the window size is determined by finding the growth point of the lowest-frequency activity relation. The significance of this work is that discovering concept drift in sequences of software process events makes it possible to mine process models that better match the behavior patterns of the software process, and thereby to better guarantee and maintain the quality of software products.
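The three-step framework described in the abstract (relation matrix, distance matrix, change in distance distribution) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how the Jaccard distance or the distance distribution is defined, so the sketch assumes that the distance between two activities is the Jaccard distance of their direct-successor sets, that the distribution is a 5-bin histogram of pairwise distances, and that a drift is flagged when the KL divergence between adjacent windows exceeds a fixed threshold. All function names, the bin count, the smoothing constant and the threshold are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import log


def split_windows(events, size):
    """Cut the stream into windows that overlap by one event, so the tail
    activity of one window is the first activity of the next (size >= 2)."""
    step = size - 1
    return [events[i:i + size] for i in range(0, len(events) - 1, step)]


def successors(window):
    """Directly-follows relation: activity -> set of activities right after it."""
    succ = defaultdict(set)
    for a, b in zip(window, window[1:]):
        succ[a].add(b)
    return succ


def jaccard_distance(s, t):
    """1 - |s & t| / |s | t|; two empty sets are treated as identical."""
    union = s | t
    if not union:
        return 0.0
    return 1.0 - len(s & t) / len(union)


def distance_histogram(window, activities, bins=5, eps=1e-6):
    """Assumed 'distance distribution': a smoothed histogram of the pairwise
    Jaccard distances between the successor sets of all activity pairs."""
    succ = successors(window)
    counts = [0] * bins
    for a, b in combinations(activities, 2):
        d = jaccard_distance(succ.get(a, set()), succ.get(b, set()))
        counts[min(int(d * bins), bins - 1)] += 1
    total = sum(counts)
    # Additive smoothing keeps every bin positive so the KL divergence is finite.
    return [(c + eps) / (total + eps * bins) for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))


def detect_drifts(events, size, threshold):
    """Return the start index of every window whose distance distribution
    diverges from that of the previous window by more than `threshold`."""
    activities = sorted(set(events))  # assumes the activity set is known up front
    hists = [distance_histogram(w, activities) for w in split_windows(events, size)]
    return [i * (size - 1)
            for i in range(1, len(hists))
            if kl_divergence(hists[i - 1], hists[i]) > threshold]
```

On a synthetic stream such as `["a","b","d","a","c","d"] * 4 + ["a","b","c","d"] * 6`, where a choice between `b` and `c` turns into a strict sequence at index 24, `detect_drifts(events, 13, 0.5)` reports `[24]`. Note that a histogram of distances ignores which activities are involved, so a pure relabeling of activities would go unnoticed under these assumptions; the thesis's actual distribution may be defined differently.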
Keywords/Search Tags: Software behavior process, Concept drift, Event stream, LDA, Distance