Towards Extracting Implicated Semantics from Artifacts to Improve IR-Based Automated Traceability Recovery

Posted on: 2022-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2518306725984919
Subject: Master of Engineering
Abstract/Summary:
Software traceability aims to create and maintain relations between software artifacts (e.g., requirements and code) throughout the development process. By correlating artifacts at different levels, it helps developers and maintainers determine the scope and impact of changes to software functionality and reduces the maintenance cost of software systems. However, manually establishing requirements-to-code traceability in a rapidly iterating software system is time-consuming and error-prone. How to automate the creation of traceability links between artifacts has therefore become an active and difficult research area.

Current automated traceability recovery is mainly based on information retrieval (IR). These methods compute the textual similarity between requirements and code and generate a list of candidate traceability links, sorted in descending order of similarity, for users to inspect and confirm. Unfortunately, they depend heavily on the quality of the text embedded in software artifacts, and the lexical mismatch between different artifacts hinders their performance. To improve the accuracy of the recovered links, many studies have proposed enhancement strategies, which fall mainly into two categories: those based on code dependencies and those using user feedback. Most IR-based enhancement strategies have not yet considered the structure of the requirement text, and they overlook two dimensions that matter for software maintenance activities, namely artifact text quality and implicated semantics. As a result, their accuracy remains poor.

Based on these observations, and in order to generate a more accurate list of candidate requirements-to-code traceability links on top of existing methods, we pursue two research ideas: (1) recover and improve the quality of the code text used as method input by expanding abbreviations in the source code; (2) extract implicated semantics from the artifact text to explore potential trace relations between requirements and code, and use the structure of the requirement text to define optimization strategies, thus improving the accuracy of existing methods. The work in this thesis is summarized as follows.

1. We propose and implement an abbreviation-expansion-based approach to optimize code text. Based on a dataset sampled from nine open-source projects, we analyze how developers use abbreviations when coding and then apply a series of heuristic algorithms to expand abbreviations automatically. The approach relies on direct static syntactic analysis combined with the fine-grained context of identifiers to accomplish the expansion.

2. We propose a semantic enhancement method for artifacts based on term pairs and text structure. On the one hand, we extract term pairs from the source code and match them against the requirement text to explore potential relations between requirements and source code; on the other hand, we consider the importance of each part of the requirement structure and propose an oblique multiplication strategy to strengthen the relations between artifacts.

3. Building on our previous research, we validate the traceability recovery methods on use cases and issue texts. We combine the automated abbreviation expansion strategy with implicated semantic extraction from artifact texts and name the resulting approach TRAVIS (IR-Based Traceability Recovery improved by AbbreViation expansion and Implicated Semantics). To validate its effectiveness, we use two high-quality datasets that are widely used in the field and two traceability datasets that we collected and curated from open-source systems widely used in everyday engineering practice.
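
As an illustration of the IR-based pipeline the abstract describes, the following is a minimal sketch, not the TRAVIS implementation. It splits identifiers, expands abbreviations, computes TF-IDF cosine similarity between one requirement and each code file, and returns the candidate links in descending order of similarity. The abbreviation table, the function names, and the choice of TF-IDF with cosine similarity are all assumptions made for this example; the thesis derives expansions heuristically from identifier context and does not specify its IR model here.

```python
import math
import re
from collections import Counter

# Hypothetical abbreviation table, hard-coded only for illustration.
ABBREVIATIONS = {"usr": "user", "pwd": "password", "auth": "authenticate"}


def tokenize(text):
    """Split text into lowercase words (including camelCase parts)
    and expand any abbreviation found in the table above."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", text)
    return [ABBREVIATIONS.get(w.lower(), w.lower()) for w in words]


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (dicts)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_links(requirement, code_files):
    """Return (file name, similarity) pairs, most similar first."""
    names = list(code_files)
    docs = [tokenize(requirement)] + [tokenize(code_files[n]) for n in names]
    vectors = tfidf_vectors(docs)
    scored = [(n, cosine(vectors[0], v)) for n, v in zip(names, vectors[1:])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    requirement = "The user shall authenticate with a password"
    code_files = {
        "Report.java": "void printReport(Table table)",
        "Login.java": "boolean authUsr(String usr, String pwd)",
    }
    for name, score in rank_links(requirement, code_files):
        print(f"{name}: {score:.3f}")
```

Without the expansion step, `authUsr` and `pwd` share no terms with the requirement text, which is the lexical mismatch problem the abstract refers to; with it, `Login.java` ranks above `Report.java`.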
Keywords/Search Tags: Software traceability, Information retrieval, Abbreviation expansion, Implicated semantic extraction