
Data Characteristics-Driven Software System Textual Feature Location Method Research

Posted on: 2020-09-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y He
Full Text: PDF
GTID: 1488305753971989
Subject: Systems analysis and integration
Abstract/Summary:
Feature location aims to determine the mapping between the features involved in evolution tasks and the source code components that implement them. The evolution of a software system is represented externally by changes in functional entities, which are called features. Software evolution is the modification of software driven by deviations from users' functional or non-functional expectations. To carry out software evolution effectively, two important questions need to be answered: 1. where to evolve; 2. how to evolve. The former is a prerequisite for the latter. Feature location is an effective method for answering the question of "where to evolve", and it can help control the efficiency and cost of a software system's evolution. Textual feature location is a hot research topic within feature location for software systems. Noise, index model selection, and location process configuration are the three main problems in research on textual feature location methods. A software system's source code is essentially text data in a special format. This paper aims to solve these problems with a data characteristics-driven approach, seeking solutions in the source code data itself. The main research contents and results are as follows:

1. Textual feature location method for software systems with a part-of-speech filtering algorithm. Source code corpora contain a large amount of noise data, yet all existing feature location methods filter noise using only a stop word list, which cannot filter it adequately. This paper proposes a source code corpus preprocessing method based on a part-of-speech filtering algorithm. In the preprocessing step, part-of-speech tagging is performed on all words in the corpus, and the words whose tags are designated as noise are then filtered out.

2. Weighted semantic similarity algorithm driven by the software system's structural information. The index models used in existing textual feature location methods can be divided into two categories: bag-of-words models and word embedding models. Each kind of index model has its advantages and disadvantages. To address this problem, this paper proposes a weighted semantic similarity algorithm driven by the software system's structural information. The integration process relies on the degree of cohesion and coupling of the source code's structure.

3. Weighted semantic similarity algorithm based on discrimination. The weighted semantic similarity algorithm driven by structural information is constrained by the structural information of the software system. In some location tasks the structure of the software system is not obvious, and the algorithm cannot be applied. To solve this problem, a weighted semantic similarity algorithm based on discrimination is proposed. The method defines the distribution of the similarities generated by an index model as its discrimination, and integrates the similarities generated by different index methods using their discriminations as weights.

4. A self-adaptive textual software feature location process configuration method. Existing textual feature location methods are based on traditional natural language processing and information retrieval technology, and ignore the grammatical features of software source code that differ from those of natural language. As a result, the existing textual feature location process fails to reflect the data characteristics of software source code, which limits the performance of textual feature location methods to a certain extent. To solve this problem, this paper proposes a self-adaptive textual software feature location process configuration method. The proposed method is implemented on the basis of a genetic algorithm and uses a small amount of sample data as input to automatically identify and configure the optimal textual feature location process.

Based on the characteristics of software systems' source code, this paper builds a data characteristics-driven line of research on textual feature location for software systems. This research distinguishes textual feature location for software systems from traditional text information retrieval. It differs from existing textual feature location methods, which directly borrow existing natural language processing and information retrieval technology. In the process of textual feature location, this research improves the performance of feature location methods based on the characteristics of different software systems.
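
As an illustration of the discrimination-based weighting described in item 3 of the abstract, the following is a minimal sketch and not the dissertation's implementation. The abstract does not give a formula for discrimination, so this sketch assumes it can be approximated by the population standard deviation of one index model's similarity scores over all candidate code components. The function names `discrimination` and `fuse_similarities`, the model labels, and the sample scores are all hypothetical.

```python
from statistics import pstdev


def discrimination(scores):
    """Assumed measure: spread of one index model's similarity scores.

    A model that scores every candidate almost the same separates
    candidates poorly, so it receives a small weight.
    """
    return pstdev(scores)


def fuse_similarities(similarities_by_model):
    """Combine per-model similarity lists into one weighted list.

    similarities_by_model maps a model name to its similarity scores,
    one score per candidate component, in the same candidate order.
    """
    weights = {
        name: discrimination(scores)
        for name, scores in similarities_by_model.items()
    }
    total = sum(weights.values())
    if total == 0:
        # Every model is flat: fall back to equal weights.
        weights = {name: 1.0 for name in weights}
        total = float(len(weights))

    n_candidates = len(next(iter(similarities_by_model.values())))
    return [
        sum(weights[m] * similarities_by_model[m][i] for m in weights) / total
        for i in range(n_candidates)
    ]


# Hypothetical scores of one query against three candidate methods.
scores = {
    "bag_of_words": [0.9, 0.1, 0.2],    # widely spread scores
    "word_embedding": [0.5, 0.5, 0.6],  # nearly flat scores
}
fused = fuse_similarities(scores)
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
print(ranking)
```

In this sketch the model whose scores are more spread out dominates the fused ranking. Any other measure of how sharply a model separates candidates could be substituted for `pstdev` without changing the structure.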
Keywords/Search Tags:Software system, Evolution, Feature location, Part-of-speech filtering, Weighted semantic similarity, Self-adaptive