
Research On A Logging Statement Level Recommendation Method Based On Machine Learning Technology

Posted on: 2020-01-24 | Degree: Master | Type: Thesis
Country: China | Candidate: B B Xue | Full Text: PDF
GTID: 2428330575955089 | Subject: Master of Engineering
Abstract/Summary:
Because logging statements capture and record runtime information, they have become the main source of information for analyzing the causes of failures in software systems. Meanwhile, rapid change in the Internet field has brought more users and richer functional requirements, which steadily raises the quality and performance demands on software systems. For these reasons, logging statements attract increasing attention from practitioners and researchers, and properly inserting logging statements into code has become an important part of developers' daily work.

A logging statement uses static text and optional related variables to record critical event information in the system. When writing one, developers must decide where to log and what to log. Considering only these two aspects is not enough, however. Existing logging frameworks and tools require each logging statement to be assigned a log level that describes the verbosity of the log message, and this level determines which logs are ultimately saved. If a logging statement is assigned an inappropriate level, information that should have been recorded may not be stored, leaving subsequent log analysis without critical information. Existing studies have shown that, because developers must weigh the benefits of plentiful, detailed logs against their costs, they spend considerable effort assigning a log level to a logging statement, and they often rely on their own development experience and domain knowledge to decide. Providing developers with effective guidance on logging statements has therefore become an urgent and important research task.

This thesis therefore proposes a machine-learning-based method for recommending log levels to developers. It has been found that, when determining the log level of a newly added logging statement, the information provided by the code block and file containing the statement plays the most important role. Accordingly, numerical, Boolean, and textual features are extracted from the containing code block and file and processed into numerical text features, which serve as the model input; the model then predicts an appropriate level for the newly added logging statement. The training data come from the top 100 Java projects on GitHub, which have good logging practices and reliable data quality, cover a variety of product types, and have been running for a long time. The thesis builds classifiers with three traditional text categorization algorithms (decision tree, support vector machine, and logistic regression) and with a convolutional neural network from the field of deep learning. After feature learning on the logging statements in these projects, all four classifiers achieve excellent results on the evaluation metrics (AUC and Brier score), and they also perform better on the data sets used in similar studies. Experiments on randomly sampled data further show that the proposed method is stable and widely applicable.
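The abstract names AUC and Brier score as its evaluation metrics but does not say how they are computed. The following is a minimal Python sketch of the standard textbook definitions, not the thesis's own evaluation code. The level set `LEVELS`, the helper names `brier_score` and `auc_one_vs_rest`, and the sample predictions are all assumptions made for illustration.

```python
from typing import Dict, Sequence

# Assumed label set; the abstract does not list the levels it predicts.
LEVELS = ["trace", "debug", "info", "warn", "error", "fatal"]


def brier_score(probs: Sequence[Dict[str, float]], truth: Sequence[str]) -> float:
    """Multi-class Brier score: mean squared distance between each predicted
    probability vector and the one-hot true label. 0.0 is a perfect score."""
    total = 0.0
    for p, y in zip(probs, truth):
        total += sum(
            (p.get(level, 0.0) - (1.0 if level == y else 0.0)) ** 2
            for level in LEVELS
        )
    return total / len(truth)


def auc_one_vs_rest(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUC for one level against all others: the probability that a randomly
    chosen positive sample is scored above a randomly chosen negative one,
    with ties counted as half."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical classifier output for three logging statements.
predictions = [
    {"info": 0.7, "debug": 0.2, "error": 0.1},
    {"info": 0.1, "debug": 0.1, "error": 0.8},
    {"info": 0.3, "debug": 0.6, "error": 0.1},
]
truth = ["info", "error", "debug"]

brier = brier_score(predictions, truth)
error_auc = auc_one_vs_rest(
    [p.get("error", 0.0) for p in predictions],
    [y == "error" for y in truth],
)
print(f"Brier score: {brier:.4f}, AUC for 'error' vs rest: {error_auc:.2f}")
```

Lower Brier scores and higher AUC values indicate better predictions. The pairwise AUC loop is quadratic in the number of samples, which is fine for a sketch; a rank-based computation would be the usual choice for a corpus the size of 100 projects.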
Keywords/Search Tags: logging statements, machine learning, text categorization, log level