Research On Automatic Tracing Of Software Process Artifacts Based On Software Repository And IR Model

Posted on: 2021-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: T T Zhang
GTID: 2428330647950880
Subject: Software engineering
Abstract/Summary:
Software traceability is the ability to interrelate any uniquely identifiable software engineering artifact to any other, maintain the required links over time, and use the resulting network to answer questions about both the software artifact and its development process. Research on software traceability studies the trace relationships between artifacts, which help engineers understand an artifact and the artifacts associated with it, and which support software maintenance and evolution. Traceability is also an essential part of the approval and certification process for most safety-critical systems, where it is used to help ensure their correctness.

Existing research focuses mainly on trace links between requirements and source code. Mainstream approaches use an information retrieval (IR) model to generate a ranked list of candidate trace links based on text similarity. However, because of lexical mismatches, irregular naming of code elements, and other low-quality text, the resulting precision and recall remain unsatisfactory. Differences in semantic context between artifacts pose a further challenge to IR methods based on text similarity. In addition, current research is mostly limited to requirements and rarely explores trace relationships among other artifacts, and the projects used for validation are mostly open-source and student projects, so tracing practice based on industrial data is still lacking.

This thesis proposes a method for automatic artifact tracing and explores the feasibility of automatically recovering associations among multiple artifact types. The method has two parts. For artifacts with explicit associations, features are learned from the existing trace links, a classifier is trained, and the missing links are recovered. For artifacts with implicit associations, the trace relationship is established by mining relevant features with the help of intermediate artifacts. Specifically, drawing on process-related information and the textual context of the artifacts, the task of mining artifact features and building trace relationships is reduced to a binary classification problem in machine learning, and a classifier predicts whether a trace link exists between two artifacts. Five different classifiers are used, and the tracing model is evaluated from the perspectives of sampling method, model comparison, and feature analysis.

Using the proposed method, five kinds of trace relationships among four types of artifacts are recovered, and the applicability of the method is verified on five industrial projects. The results show that random forest performs best among the five classifiers, and that, among the three feature combinations, the combination of text similarity and artifact production process achieves the best performance. The average F1 score is about 0.77 for the four kinds of explicit trace relationships and 0.86 for the implicit trace relationship.
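
To make the IR baseline mentioned in the abstract concrete, the following is a minimal sketch of ranking candidate trace links by text similarity. It assumes TF-IDF weighting with cosine similarity, which is a common choice but is not confirmed by the abstract as the thesis's IR model. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_candidate_links`) and the sample artifact texts are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption), so terms found in every document keep some weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    return [
        {term: freq * idf[term] for term, freq in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidate_links(source_text, targets):
    """Rank target artifacts (name -> text) by similarity to the source artifact."""
    names = list(targets)
    vectors = tfidf_vectors([source_text] + [targets[name] for name in names])
    query = vectors[0]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, vectors[1:])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical artifacts, for illustration only.
requirement = "The user can reset a forgotten password by email"
artifacts = {
    "PasswordResetService.java": "send password reset email to user with token",
    "InvoicePrinter.java": "print invoice pdf for customer order",
    "LoginController.java": "check user name and password on login",
}
ranking = rank_candidate_links(requirement, artifacts)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```

In a setup like the one the abstract describes, such a similarity score would plausibly serve as one feature among several (alongside process-related features) fed to a binary classifier, rather than as the final ranking on its own.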
Keywords/Search Tags:Software Traceability, Software Artifacts, Information Retrieval Model, Automatic Tracing, Feature Extraction, Software Development Process