| In structural testing, infeasible paths waste a large share of the testing budget and greatly reduce the efficiency of test case generation. If path feasibility can be determined in advance, test resources can be devoted entirely to generating test cases for feasible paths. Traditional approaches to determining path feasibility are static, dynamic, or hybrid. Static methods are tightly coupled to the programming language, and constraint solving for paths is expensive. Dynamic methods rely on test data generation and execute inefficiently. Hybrid methods improve solving efficiency but remain bound by the limitations of the two traditional approaches. At present there is no efficient, general method for identifying infeasible paths, especially in large-scale programs. To address this problem, this paper proposes a machine-learning-based method for determining the feasibility of program paths. It recasts path feasibility determination as a path feasibility classification problem and uses machine learning to make the judgment more efficient, which also gives the approach a degree of generality. The main contributions are as follows. (1) The paper proposes a method for extracting path features from path constraints. Path constraints are first extracted by symbolic execution and then simplified by techniques such as constraint atomization and deletion of invalid constraints. The simplified constraints are then encoded, and the encoded constraints are combined with keyword frequencies to form the feature space. (2) The paper proposes a method for determining path feasibility based on machine learning. Program paths are first generated in a pre-processing stage, and the feasibility of each path is labeled through interval arithmetic, dynamic execution, and manual confirmation. Constraints and keywords are then extracted as features and vectorized. Finally, several classification models are trained on the encoded features and evaluated against several acceptance criteria. Using machine learning to determine path feasibility applies to both intra-process and inter-process paths and scales relatively well. Based on this methodology, we constructed a logistic regression model, a support vector machine model, a random forest model, and three ensemble classification models, and selected 10 generated process paths from an open-source C-language project to cross-validate path feasibility. Experimental results show that the proposed method can provide path feasibility judgments for source code of tens of thousands of lines, and that every model reaches an accuracy above 80%. |
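
The feature-extraction step in contribution (1) can be illustrated with a minimal sketch. The abstract does not specify the actual encoding, so everything below is an assumption made for illustration: the keyword list `KEYWORDS`, the operator codes `OP_CODES`, the fixed vector length `max_atoms`, and all function names are hypothetical. Constraints are assumed to arrive as strings of simple relational atoms joined by `&&` or `||`, standing in for the output of a symbolic executor.

```python
import re
from collections import Counter

# Hypothetical keyword list and operator codes; not taken from the paper.
KEYWORDS = ("if", "else", "while", "for", "switch", "return")
OP_CODES = {"==": 1, "!=": 2, "<": 3, "<=": 4, ">": 5, ">=": 6}
ATOM = re.compile(r"^(\w+)\s*(==|!=|<=|>=|<|>)\s*(\w+)$")


def atomize(constraint):
    """Split a compound constraint into atomic relational constraints."""
    parts = re.split(r"&&|\|\|", constraint)
    return [p.strip().strip("()").strip() for p in parts if p.strip()]


def drop_invalid(atoms):
    """Drop unparsable atoms, trivial tautologies (x == x), and duplicates."""
    seen, kept = set(), []
    for atom in atoms:
        match = ATOM.match(atom)
        if match is None:
            continue  # not a simple "lhs op rhs" atom
        lhs, op, rhs = match.groups()
        if lhs == rhs and op in ("==", "<=", ">="):
            continue  # always true, carries no information
        key = (lhs, op, rhs)
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept


def encode(atoms, max_atoms=4):
    """Encode each atom as (operator code, rhs kind), zero-padded to a fixed length."""
    vec = []
    for _lhs, op, rhs in atoms[:max_atoms]:
        vec += [OP_CODES[op], 0 if rhs.isdigit() else 1]  # 0 = constant, 1 = variable
    vec += [0] * (2 * max_atoms - len(vec))
    return vec


def keyword_freq(source):
    """Count how often each keyword appears among the identifiers of the path's code."""
    counts = Counter(re.findall(r"[A-Za-z_]\w*", source))
    return [counts[k] for k in KEYWORDS]


def path_features(constraints, source):
    """Concatenate encoded constraints and keyword frequencies into one vector."""
    atoms = [a for c in constraints for a in atomize(c)]
    return encode(drop_invalid(atoms)) + keyword_freq(source)


if __name__ == "__main__":
    constraints = ["(x > 0 && y <= 3)", "x > 0", "z == z"]
    source = "if (x > 0) { while (y <= 3) { y++; } } return x;"
    print(path_features(constraints, source))
```

A vector of this kind, paired with a feasibility label, is the sort of input a classifier such as logistic regression or a random forest could be trained on. The fixed-length padding is a design choice of this sketch, made so that every path yields a vector of the same dimension.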