| With the rapid development of Internet and software technologies, the number of legacy systems and free software packages has grown steadily. A lack of relevant documentation makes the structure of such software hard to understand, which in turn hinders its development and maintenance. To help people understand software structure, structural analysis is needed to obtain a software structure model that shows the calling relationships among the methods of a program. UML diagrams have become the de facto standard for describing software, but they have no precise semantics, so they cannot be used directly by model-based techniques and do not support performance analysis. Process mining can instead be applied to structural analysis: by mining the event log that records the runtime information of the software, a structural model is obtained, and because this model has precise semantics it can be used directly for model-based analysis (e.g., software performance analysis). When process mining is applied to structural analysis, the input of the mining algorithm is an event log whose events are grouped by running instance (abbreviated as instance). However, because the software processes multiple user requests in parallel, the events of multiple instances are interleaved in the original log and carry no instance ID, so they cannot be grouped by instance. The original log therefore needs to be instantiated, that is, its events need to be grouped by instance, before it can be used with current process mining algorithms and tools. The key to grouping events by instance is the correlation between events. However, because the software uses multiple threads and multiple processes to handle each user request, the event correlation problem in instantiation depends on thread
association problems and process association problems. In addition, because threads and communication resources are reused, associations can be inaccurate, which makes instantiating the original log more challenging. To solve these problems, this paper studies the original log instantiation problem in depth. Based on the correlations among events, threads, and processes in the original log, an instantiation algorithm is proposed. The specific work of this paper is as follows: (1) The original log is formally defined so that it contains the runtime information needed for instantiation and structural analysis, and an original log collection tool is implemented based on reflection implant technology. (2) For the event association problem within a single thread, a calling association rule is defined, based on the calling relationship between methods and the containment relationship between method running time intervals, to associate two events belonging to the same instance. (3) For the thread association problem (that is, associating threads within the same process), a thread association rule is defined, based on the parent-child relationship between threads and their execution order, to associate two threads belonging to the same instance. (4) For the process association problem (that is, associating threads in different processes), a process association rule is defined, based on the communication between threads and the time interval during which a thread occupies a communication resource, to associate two threads that belong to the same instance but run in different processes. (5) Based on these three association rules, an original log instantiation algorithm is implemented that groups the events in the original log by instance: first, the calling association rule is used to aggregate the events belonging to the same instance in the same
thread; then the thread association rule is used to aggregate events belonging to the same instance in different threads; finally, the process association rule is used to aggregate events belonging to the same instance in different processes, yielding the event log. Multi-threaded and multi-process experiments were used to evaluate the feasibility of the instantiation algorithm and the impact of log collection on software performance. In the two experiments, compared with existing algorithms, the instantiation success rates reach 80% and 92%, respectively, indicating that the instantiation algorithm proposed in this paper can solve the original log instantiation problem for structural analysis. A comparison of the running time of the original software with that of the software containing the monitoring code shows that the log collection method based on reflection implant has little influence on the time cost of the software. Finally, the event log obtained from instantiation is used for process mining to perform structural analysis and performance analysis, which shows that the work of this paper is meaningful for software analysis. |
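
The three-stage instantiation described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the formal rule definitions, so the `Event` fields (`parent_thread_id`, `channel`), the three predicate functions, and the sample `LOG` are all hypothetical simplifications. The calling rule is approximated by time-interval containment within one thread, the thread rule by a child thread starting during an event of its parent thread, and the process rule by two events in different processes occupying the same communication resource during overlapping intervals. Associated events are merged with a simple union-find.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One method execution in the original log (hypothetical schema)."""
    method: str
    process_id: int
    thread_id: int
    start: int
    end: int
    parent_thread_id: Optional[int] = None  # assumed: thread that spawned this thread
    channel: Optional[str] = None           # assumed: id of a communication resource


def calling_rule(a: Event, b: Event) -> bool:
    # Same thread, and b's running interval lies inside a's (a calls b).
    return ((a.process_id, a.thread_id) == (b.process_id, b.thread_id)
            and a.start <= b.start and b.end <= a.end)


def thread_rule(a: Event, b: Event) -> bool:
    # Same process, b's thread was spawned by a's thread while a was running.
    return (a.process_id == b.process_id
            and b.parent_thread_id == a.thread_id
            and a.start <= b.start <= a.end)


def process_rule(a: Event, b: Event) -> bool:
    # Different processes using the same channel during overlapping intervals.
    return (a.process_id != b.process_id
            and a.channel is not None and a.channel == b.channel
            and a.start <= b.end and b.start <= a.end)


def instantiate(events: List[Event]) -> List[List[Event]]:
    """Group events by instance by applying the three rules in order."""
    parent = list(range(len(events)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for rule in (calling_rule, thread_rule, process_rule):
        for i, j in permutations(range(len(events)), 2):
            if rule(events[i], events[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, event in enumerate(events):
        groups.setdefault(find(i), []).append(event)
    cases = [sorted(g, key=lambda e: e.start) for g in groups.values()]
    return sorted(cases, key=lambda g: g[0].start)


# Two requests; thread 1 of process 1 is reused for the second request.
LOG = [
    Event("handle", 1, 1, 0, 10),
    Event("parse", 1, 1, 1, 3),
    Event("work", 1, 2, 4, 8, parent_thread_id=1),
    Event("send", 1, 2, 5, 7, parent_thread_id=1, channel="c1"),
    Event("recv", 2, 9, 5, 8, channel="c1"),
    Event("handle", 1, 1, 20, 30),
    Event("parse", 1, 1, 21, 22),
]
cases = instantiate(LOG)
```

On this sample log the sketch yields two instances: the first request spans both threads and both processes, while the second stays separate even though it reuses thread 1, because none of its intervals are contained in or overlap those of the first request. The pairwise comparison is quadratic in the number of events and is meant only to make the rule ordering concrete.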