| Code comments describe the logic or function of code in natural language. Developers rely on them to understand the source code in software repositories. Reading and understanding source code from the open source community during software development is time-consuming and labor-intensive. Source code that carries functional, descriptive comments greatly simplifies and speeds up development, and better comments also improve the maintainability of the software system. Processing the comments in a software repository and mapping them to similar code fragments in a target project is called comment reuse. Program parsing, in the context of comment optimization, starts from the code and the comment themselves and analyzes the meaning of the code. At present, however, less than 20% of code has corresponding comments, and adding comments to source code manually takes time and effort. This thesis studies the low number and low quality of automatically generated code comments in order to provide researchers with valuable and meaningful information and data. The specific work is as follows: 1. Obtain clone detection results with Nicad and extract code and comments. The key to comment reuse is to find code in the software repository that is similar to code in the target project and then extract it. This thesis uses the clone detection tool Nicad to detect similar code. The clone detection results are used to analyze the code and comments, and the cloned code and its corresponding comments are extracted from their positions in the software repository to build a candidate list that provides the basic data for the subsequent research. 2. Starting from the candidate list of cloned code and code comments, streamline and optimize both to obtain high-quality cloned code and comments. The initially obtained cloned code and comments suffer from redundant clones, irregular comment formats, and mismatches between comment content and code. They are streamlined and optimized through a series of heuristic rules and program parsing methods. 3. Perform "code-comment" mapping on the high-quality code and comments. A code fragment may correspond to several candidate comments (a one-to-many relationship). The similarity score between each code fragment node and each comment node is calculated, the candidates are sorted by score, and the comment with the highest score is mapped to the code, which yields the final document. 4. Assess the quality of the resulting code-comment mapping document along multiple dimensions. In the manual evaluation, participants with Java programming experience rate the automatically generated comments through a questionnaire. In the baseline comparison, the generated comments are compared with those of Clocom, a code comment generation tool previously proposed in the field. Experiments show that the method of combining comment reuse and program parsing proposed in this thesis increases the average number of comments per software project by 12% and their average quality by 5%, and the questionnaire shows that the vast majority of participants consider the automatically generated comments accurate and effective in helping them understand the code. |
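
The "code-comment" mapping in step 3 (score each candidate, sort, keep the highest) can be illustrated with a minimal sketch. The abstract does not state how the similarity score is computed, so the Jaccard overlap between identifier words and comment words used here is purely an assumption, and the helper names `tokens`, `jaccard`, and `best_comment` are hypothetical rather than taken from the thesis.

```python
import re


def tokens(text):
    """Return the set of lowercase words in text, splitting camelCase and snake_case."""
    parts = []
    for word in re.findall(r"[A-Za-z]+", text):
        # Split camelCase identifiers, e.g. "readFile" -> "read", "File".
        parts.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word))
    return {p.lower() for p in parts}


def jaccard(a, b):
    """Jaccard overlap of two token sets (assumed stand-in for the thesis's score)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_comment(code, candidate_comments):
    """Score every candidate comment against the code and return (score, comment)
    for the highest-scoring one, or (0.0, None) if there are no candidates."""
    code_tokens = tokens(code)
    scored = [(jaccard(code_tokens, tokens(c)), c) for c in candidate_comments]
    # Sort by score, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0] if scored else (0.0, None)


if __name__ == "__main__":
    code = "public String readFile(String path) { return Files.readString(Paths.get(path)); }"
    comments = [
        "Closes the database connection.",
        "Reads the file at the given path and returns its content as a string.",
        "Sorts the list in ascending order.",
    ]
    score, comment = best_comment(code, comments)
    print(comment)
```

A lexical overlap of this kind is only one possible score; it is used here because it needs no training data and makes the sort-and-select step easy to follow.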