
Log Enhancement For Large-Scale Open-Source Software

Posted on: 2016-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Jia
GTID: 2348330536967728
Subject: Software engineering
Abstract/Summary:
As software continues to scale up, logging has become an indispensable part of failure diagnosis: very similar symptoms can be caused by different software bugs, and log messages are usually the most direct evidence. Meanwhile, most large-scale software is developed by many people, so logs tend to be written casually according to each developer's habits rather than under a common specification. Although a great deal of attention has already been paid to automated log management, existing solutions only handle certain code patterns. Semantic bugs gradually dominate as software matures, and these bugs cannot be handled well by simply summarized code patterns. In this paper, we design and implement SmartLog, an automated log-insertion tool that learns program-specific logging practice and finds error-prone locations from statistical information about logging behavior. Our work includes:

1. We characterize system logs in five widely used pieces of software (MySQL, Subversion, Apache Http, PostgreSQL and Wireshark) and make six log-related observations: logging behavior can be affected by context; semantic bugs gradually dominate; multi-developer software leads to file-specific logging styles; the wide use of error return codes influences system logs; developers do not always log at error-prone program points; and test modules affect log density.

2. SmartLog proposes a machine learning method to recognize logging functions automatically, removing a limitation of existing log tools. With reasonable feature extraction and filtering, its ability to recognize logging functions is 76 times that of the keyword method, and the F-score reaches 0.93.

3. During the log enhancement process, SmartLog recognizes logged snippets based on the logging model, with a significant accuracy boost over existing methods. A sample test shows false positive and false negative rates of 4% and 13%, respectively. Additionally, SmartLog proposes a binary checking tree (BCT) to determine the semantic equivalence of different logging contexts. The evaluation shows that BCT scales well and reaches a recognition accuracy of 97%. Based on statistics of how often logging occurs under equivalent contexts, SmartLog enhances system logs automatically while taking performance overhead, code readability and maintainability into account.

Because it relies on lightweight statistical analysis and machine learning, SmartLog's complexity scales linearly with lines of code, costing about 45 seconds per million lines. SmartLog adds 5% more logs relative to the existing logs while contributing less than 1% performance overhead. The validity evaluation shows that 86% of the new logs are considered error-prone according to evidence from developers.
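The metrics quoted in the abstract (F-score, false positive and false negative rates, and a cost that grows linearly with lines of code) can be made concrete with a small sketch. The following Python is only an illustration under stated assumptions: it uses the conventional definitions of these metrics, which may differ from the ones used in the thesis, and the function names `f_score`, `error_rates` and `estimated_analysis_seconds` are hypothetical and not part of SmartLog.

```python
from typing import Tuple


def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def error_rates(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate) from confusion counts.

    Assumption: FPR = FP / (FP + TN) and FNR = FN / (FN + TP), the
    conventional definitions; the thesis may define them differently.
    """
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


def estimated_analysis_seconds(lines_of_code: int,
                               seconds_per_million: float = 45.0) -> float:
    """Linear cost model using the abstract's figure of about 45 s per million lines."""
    return lines_of_code / 1_000_000 * seconds_per_million


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration; not data from the thesis.
    fpr, fnr = error_rates(tp=87, fp=4, tn=96, fn=13)
    print(f"FPR={fpr:.2f} FNR={fnr:.2f}")
    print(f"Estimated time for 2M lines: {estimated_analysis_seconds(2_000_000):.0f} s")
```

Under this linear model, doubling the code base doubles the estimated analysis time, which is what "scales linearly with lines of code" implies.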
Keywords/Search Tags: large-scale software, log enhancement, code quality, machine learning, static analysis