
Real-time Mining Software Process Activities From SVN Log Event Stream

Posted on: 2019-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Dai
Full Text: PDF
GTID: 2428330548473574
Subject: Software engineering
Abstract/Summary:
The challenge for Big Data technologies is how to convert data into real value. At present, much attention goes to the storage and processing of Big Data, while the process behind the data is ignored. Process mining builds a bridge between traditional model-based analysis (such as simulation and business process management) and data-driven analysis (such as machine learning and data mining). Because it focuses on the process and works from real data, it can be used to actively learn models of real human behavior.

Existing process mining techniques have not yet been applied directly to software process mining. To address the lack of activity attributes in software process logs, this thesis proposes an automated, real-time method for extracting software process activities. The method extracts each logentry event record from the SVN log of a software process and structures its content with natural language processing. It then combines unsupervised and supervised learning: K-Means clustering assigns activity tags, and a Naive Bayes classifier built on those tags maps new events to activities. Finally, the classification result is assessed with precision, recall, and the F-measure.

In the evaluation, the approach was applied to a real-world data set of software process logs and compared with previous studies. The results show that the method can mine software process activities from SVN log events automatically and in real time. The average precision, recall, and F-measure (with the parameter set to 0.5, 1, and 1.5 respectively) reached 0.85, 0.87, 0.83, 0.84, and 0.85 respectively, which demonstrates the effectiveness of the method for activity mining in software process mining.

The main contributions of this thesis are as follows:
(1) Approaching software process mining from the new perspective of correlation, it proposes extracting activities from software process logs through mapping classification, which opens up a new line of thinking for studying process mining from the viewpoint of correlation.
(2) It combines supervised and unsupervised learning to solve the problem of missing activity information in software process log events and to classify these activities effectively. It also proposes periodically updating the log, so that the whole activity mining process is updated and controlled dynamically and its real-time performance is ensured.
(3) For unsupervised learning, it proposes determining the optimal number of clusters by taking the second-order difference (a discrete second derivative), which solves the problem of choosing the cluster number. For classifier assessment, it introduces several harmonic means of precision and recall, which adds evaluation dimensions and makes the evaluation result more objective.
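The two evaluation ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The F-measure function uses the standard weighted harmonic mean of precision and recall. The cluster-number function is an assumption about what "taking the second derivative" means here: it picks the k at which the second-order difference of a per-k clustering cost is largest. The function names, the cost dictionary, and its values are hypothetical.

```python
def f_beta(precision, recall, beta):
    """Standard F-measure: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def pick_k_by_second_difference(cost_by_k):
    """Return the k whose second-order difference of the cost is largest.

    cost_by_k maps consecutive cluster counts k to a clustering cost
    (for example the K-Means within-cluster sum of squares).
    Assumption: the thesis may use a different cost or selection rule.
    """
    ks = sorted(cost_by_k)
    if len(ks) < 3:
        raise ValueError("need at least three values of k")
    best_k, best_curvature = None, float("-inf")
    # Slide a window of three neighbouring k values over the cost curve.
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        curvature = cost_by_k[prev_k] - 2 * cost_by_k[k] + cost_by_k[next_k]
        if curvature > best_curvature:
            best_k, best_curvature = k, curvature
    return best_k


# Hypothetical cost curve with a bend at k = 3.
example_costs = {1: 100.0, 2: 60.0, 3: 20.0, 4: 18.0, 5: 17.0}
chosen_k = pick_k_by_second_difference(example_costs)

# F-measures for one classifier at the three parameter values from the abstract.
scores = {beta: f_beta(0.8, 0.9, beta) for beta in (0.5, 1, 1.5)}
```

With a parameter below 1 the F-measure weights precision more heavily, and with a parameter above 1 it weights recall more heavily, which is why reporting several values gives a broader view of the classifier than F1 alone.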
Keywords/Search Tags:Software Process, Process Log, Activity Extraction, Machine Learning