Research and Implementation of Program Dictionary Construction and Summary Generation Techniques

Posted on: 2019-06-29 | Degree: Master | Type: Thesis
Country: China | Candidate: W T Zhu | Full Text: PDF
GTID: 2428330545970001 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid spread of computer applications and the continuing advance of information technology, demand for software products keeps growing and changing. Software has grown in size and complexity, and its maintenance cost has risen accordingly. Existing studies show that program understanding is the most time-consuming part of software maintenance, accounting for about 60 percent of the entire development cycle. The traditional approach uses information retrieval techniques to analyze the semantic information in program code and then extracts topics to help developers understand the program. However, an extracted topic is often expressed as several independent words, so developers remain puzzled when reading it and spend considerable time and effort guessing what those words mean in the code. This thesis studies program understanding from two aspects, code word database construction and code summary generation, and proposes and develops more effective techniques and tools for program understanding. The main work is as follows:
(1) This thesis implements automatic construction of a program code word database for specific projects. It presents WB4HPR, a prototype tool that builds a thesaurus for a historical code library. WB4HPR automatically extracts and processes the elements in the source code and stores the processed data persistently. It also provides developers with a personalized search interface for retrieving the words they want to understand, the relationships between words, and the evolution of those words in the history database. The effectiveness of the tool is demonstrated through experimental validation and system implementation.
(2) This thesis implements program summary generation based on natural language processing. Taking the natural-language semantic information in a software program as input, it applies latent semantic analysis from information retrieval and clustering algorithms from data mining to mine summary information. In addition, semantic repair of the extracted topics helps developers understand the program better.
(3) This thesis further generates code change summaries for maintenance tasks. The summaries include the reasons for changes, which software developers often overlook, and descriptive information associated with the code modifications. Empirical studies show that code change summaries can effectively help developers understand a maintenance task.
Keywords/Search Tags: software maintenance, program understanding, code vocabulary, summary
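The abstract does not describe how WB4HPR extracts and processes source code elements. As a rough illustration of what building a code word database can involve, the following minimal sketch pulls identifiers out of a source string, splits them into lowercase words on underscores and camelCase boundaries, and counts the words. The function names `split_identifier` and `build_word_counts` and the regular expressions are assumptions made for this example; they are not taken from the thesis.

```python
import re
from collections import Counter

# Hypothetical sketch: the thesis does not specify WB4HPR's internals.
# Matches identifier-like tokens in a source string.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Matches one word inside an identifier: an acronym run (HTTP),
# a capitalized or lowercase word (Response, parse), or a digit run.
WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_identifier(identifier):
    """Split a snake_case or camelCase identifier into lowercase words."""
    words = []
    for part in identifier.split("_"):
        words.extend(match.group(0).lower() for match in WORD.finditer(part))
    return words


def build_word_counts(source):
    """Count the words that occur in the identifiers of a source string.

    No language keywords are filtered out, so tokens such as `int`
    are counted like any other word.
    """
    counts = Counter()
    for identifier in IDENTIFIER.findall(source):
        counts.update(split_identifier(identifier))
    return counts


if __name__ == "__main__":
    snippet = "int maxRetryCount = getRetryCount();"
    for word, count in build_word_counts(snippet).most_common():
        print(word, count)
```

A persistent store and a search interface, as the abstract describes, would sit on top of counts like these. Tracking how words evolve would additionally require running the extraction over each revision in the project history.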