
The Method Of Python Documentation Defect Detection Based On Static Program Analysis

Posted on: 2021-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Liang
Full Text: PDF
GTID: 2428330647951035
Subject: Software engineering
Abstract/Summary:
Modern software development usually combines contributions from many individuals and relies heavily on reusable software components, such as libraries and frameworks, to improve efficiency. To use other people's code correctly, developers need to read both the documentation and the source code to acquire the necessary knowledge. Documentation is an important carrier of a software system's domain knowledge, so its correctness and reliability directly affect the quality of software engineering tasks such as program comprehension and system maintenance. Because modern software systems iterate and evolve rapidly, documents easily become outdated or come to contain erroneous information. Moreover, because documentation and source code form a large and complex system, finding defects manually is time-consuming and error-prone. It is therefore of great practical significance to detect defects in Python documentation automatically and thereby improve the accuracy and maintainability of system documentation.

The main research content of this thesis is a method for detecting defects in Python documentation based on static program analysis. Good Python documentation usually contains information such as parameter descriptions, design principles, and example code. We focus on these three aspects through example code error detection, parameter constraint consistency detection, and linguistic antipattern detection, which together cover the three aspects of API documentation most important to user understanding. The source code package of an open-source Python framework is parsed into abstract syntax trees and its documentation is statically analyzed, so that documentation defects are detected automatically. The main steps are: extracting code examples from the comments in the source code package, analyzing the code names mentioned in the examples, and checking whether those names are defined; extracting the parameter exception specifications of the methods in the source code and the related parameter usage constraints in the comments, and checking whether the two are inconsistent; and extracting public methods from the source code package and analyzing them for linguistic antipatterns. The detected document defects are then output in a unified form.

Experiments are carried out on ten frameworks, including NumPy, SciPy, and Sklearn, comprising more than 4,000 files and more than 1.7 million lines of Python code. The input is the source code package of each framework, and the output is the three types of detected document defects. We detected 145 code example defects, 464 constraint condition defects, and 324 linguistic antipattern defects. Manual inspection shows that the method achieves 68% accuracy and a 75% recall rate, which greatly improves the efficiency of detecting document defects.
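The first of the three checks (example code error detection) can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes that code examples are written as doctest-style `>>>` prompts inside docstrings, and that a name counts as "defined" if it is a builtin, a top-level definition or import of the module, or something bound inside the example itself. The names `SOURCE`, `defined_names`, and `undefined_example_names` are hypothetical and exist only for this illustration.

```python
import ast
import builtins
import doctest

# Hypothetical module source: the second example calls `rescale`,
# which is defined nowhere, so it should be reported.
SOURCE = '''
def scale(x, factor):
    """Multiply x by factor.

    >>> scale(2, 3)
    6
    >>> rescale(2, 3)
    6
    """
    return x * factor
'''


def defined_names(tree):
    """Collect builtins plus names defined or imported at module top level."""
    names = set(dir(builtins))
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                names.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def bind_example_names(example_tree, scope):
    """Add every name an example binds itself (assignments, imports, defs, args)."""
    for node in ast.walk(example_tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            scope.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                scope.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            scope.add(node.name)
        elif isinstance(node, ast.arg):
            scope.add(node.arg)


def undefined_example_names(source):
    """Return (owner, name) pairs for names used in docstring examples but never defined."""
    tree = ast.parse(source)
    known = defined_names(tree)
    parser = doctest.DocTestParser()
    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.Module, ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        doc = ast.get_docstring(node)
        if not doc:
            continue
        owner = getattr(node, "name", "<module>")
        scope = set(known)  # names bound by earlier examples stay visible to later ones
        for example in parser.get_examples(doc):
            try:
                example_tree = ast.parse(example.source)
            except SyntaxError:
                findings.append((owner, "<syntax error>"))
                continue
            # Simplification: bindings are registered before uses are checked,
            # which ignores ordering inside a single example.
            bind_example_names(example_tree, scope)
            for sub in ast.walk(example_tree):
                if isinstance(sub, ast.Name) and isinstance(sub.ctx, ast.Load):
                    if sub.id not in scope:
                        findings.append((owner, sub.id))
    return findings


if __name__ == "__main__":
    for owner, name in undefined_example_names(SOURCE):
        print(f"{owner}: example uses undefined name {name!r}")
```

Because the examples are only parsed and never executed, the check stays purely static, in keeping with the approach described in the abstract. A fuller checker would also need to resolve attribute accesses such as `np.linalg.solve` against the package's actual contents, which this sketch does not attempt.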
Keywords/Search Tags: Software testing, static analysis, document defect detection, automatic detection