Research On Prediction Of Code Timeout Problem Based On Deep Learning

Posted on: 2021-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: M L Zhou
GTID: 2428330605982483
Subject: Computer technology
Abstract/Summary:
With the continuous expansion of the software product market, the updating of the equipment that software runs on, and the gradual maturing of the software development process, users' performance requirements are becoming increasingly prominent. Performance issues are essentially rooted in the source code. To solve the same problem, different programmers may write completely different "correct" code that has the same functionality but different performance. Most online judge systems for programming rely on automated grading, which usually uses test results to quantify the correctness and performance of the submitted source code. However, traditional dynamic testing takes a lot of time, and performance problems are usually discovered only after the fact, even for small-scale programs, whose running efficiency may differ completely because of differences in code structure or semantics. In recent years, with the popularity of deep learning, more and more researchers have tried to use deep learning models to solve software engineering problems through statistics and induction over large amounts of open source code data. Aiming at the Time Limit Exceeded (TLE) problem on programming contest websites, this paper extracts code features from two perspectives, semantic features and structural features, to build a deep learning prediction model, and verifies the validity of the method on a real data set. The innovations and main work of this paper are as follows:

1. Regarding the semantic features of the code, code written by different developers differs widely, for example in how variables are named and in the programming language used. To abstract the semantic features of the code more effectively, this paper proposes a novel method of code tokenization that establishes different tagging rules for different elements in the code. It also proposes a prediction method for the code timeout problem based on an attention-based LSTM neural network.

2. For the structural features of the code, we select the control flow graph as the object from which structural features are extracted. This paper proposes a control flow graph generation algorithm and a prediction method for the code timeout problem based on the PSCN model, which extracts effective features from the control flow graph. So that deep learning models can learn effectively and build predictive models, this paper uses graph embedding to normalize the graphs before feeding them to the models for feature learning. Finally, this paper fuses the semantic and structural features of the code to construct a new classifier and verifies its effectiveness in experiments.

3. To verify the effectiveness of our method, we collected source code in different languages from Codeforces to train and build the predictive models. In the experiments on the prediction model based on semantic features, we compare different labeling methods to identify the best one; compared with other deep learning models, the method proposed in this paper predicts better. In the experiments on the prediction model based on structural features, we compare different control flow graph construction methods to identify the best one. The final experimental results show that, compared with traditional dynamic testing, the proposed method saves a lot of time and cost, and that its prediction of code timeout problems attains a certain level of accuracy.
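The abstract does not spell out the tagging rules used for tokenization, so the following is only a minimal Python sketch of the general idea: keep language keywords and operators, but replace developer-specific elements (identifiers, numbers, string literals) with shared tags so that differently named code maps to the same token sequence. The keyword subset, the tag names `<ID>`, `<NUM>`, `<STR>`, and the function `abstract_tokens` are all assumptions made for illustration, not the thesis's actual rules.

```python
import re

# Hypothetical keyword subset; the thesis does not list its tagging rules.
KEYWORDS = {"int", "long", "for", "while", "if", "else",
            "return", "break", "continue"}

# One alternative per element class; the first matching group names the class.
TOKEN_RE = re.compile(r'''
    (?P<STR>"(?:\\.|[^"\\])*")     # double-quoted string literal
  | (?P<NUM>\d+(?:\.\d+)?)         # integer or decimal literal
  | (?P<ID>[A-Za-z_]\w*)           # identifier or keyword
  | (?P<OP>[^\sA-Za-z0-9_])        # any other single non-space character
''', re.VERBOSE)


def abstract_tokens(source):
    """Map source text to a token list with developer-specific names tagged."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "STR":
            tokens.append("<STR>")
        elif kind == "NUM":
            tokens.append("<NUM>")
        elif kind == "ID":
            # Keywords carry semantics and are kept; other names are tagged.
            tokens.append(text if text in KEYWORDS else "<ID>")
        else:
            tokens.append(text)
    return tokens


if __name__ == "__main__":
    print(abstract_tokens("for (int i = 0; i < n; i++) sum += a[i];"))
```

Under these assumed rules, `int x = 5;` and `int total = 42;` produce the same sequence, which is the kind of naming-independent input a sequence model such as an LSTM could then be trained on.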
Keywords/Search Tags: software engineering, software performance, defect prediction, feature fusion, deep learning