Font Size: a A A

Research On Neural-Guided Inductive Program Synthesis

Posted on:2024-06-04Degree:MasterType:Thesis
Country:ChinaCandidate:C Y YangFull Text:PDF
GTID:2568307058472084Subject:Electronic information
Abstract/Summary:PDF Full Text Request
Program synthesis aims to automatically generate program code that satisfies user requirements based on the specification description of the program.Since its inception,program synthesis has received considerable attention.Inductive Program Synthesis(IPS),as an important branch of program synthesis,uses Input-Output(IO)examples as the specification description of the program to synthesize computer programs that align with the user’s intentions.IPS has gained wide adoption in recent years due to its concise form and user-friendly nature,making it a hot research topic in the field of program synthesis.For each state in the process of transforming the input state to the output state during the example,the sequential order corresponds to each statement of the target program,serving as a descriptive specification for that statement.However,IO examples may encounter stagnation during the transition,where the state remains unchanged,resulting in an incomplete specification description of the target program and potential missed judgments.Furthermore,the ambiguity of examples leads to unclear boundary descriptions of program specifications and potential misjudgments of the target program.Additionally,the IPS models are often optimized using maximum likelihood estimation,which primarily focuses on maximizing the probability of a specific labeled program and assigns higher losses to other equivalent programs that can transform the input into the output.This results in weak generalization capabilities of the model,as it fails to generate any program that satisfies the examples.To address the aforementioned issues,this dissertation explores inductive program synthesis methods based on IO examples to improve the success rate and generalization of the model in synthesizing programs.The main research content is summarized as follows:(1)The Neural Two-Stage Program Synthesis(N-TSPS)method,based on neural network inspiration,is proposed.It divides program synthesis into two stages: local solution program generation and global solution program generation.In the local solution program generation stage,easily satisfiable IO examples are used to prioritize the synthesis of local solution programs that satisfy partial examples.By synthesizing local solution programs,the specification description of each example for the target program is adequately expressed.In the global solution program generation stage,the example states that incorporate contextual information from local solution programs are embedded as vectors into the model’s input,aiding in the generation of a global solution program that satisfies all the examples.(2)A program synthesis model called Model with Integrated Contextual Information(MICI)is proposed.It combines the contextual information of the surrounding statements with the example states,enhancing the features by incorporating the relational information between the surrounding statements.In this approach,a triplet structure type is designed,consisting of a program,statement index,and example index.This structure defines the example state after executing a specific program statement.The information fusion layer utilizes a multi-head attention mechanism to adjust the attention distribution of each example based on the similarity between the embedded vectors of the example states and the embedded vectors of the states executed line by line in the local solution program.This enhances the discriminative features for complete example program specification descriptions.(3)The Reinforcement Learning Guided Inductive Program Synthesis(RL-GIPS)approach is proposed,which combines reinforcement learning with supervised models.In RL-GIPS,the states of the environment in reinforcement learning are treated as the states of the supervised model,and the program statements acting on the examples are considered as actions in the reinforcement learning setting.A reward function is introduced into the objective function,which evaluates multiple equivalent programs returned by the search algorithm based on their ability to achieve the desired functionality.From these evaluations,a distribution with smaller support is learned,and the parameters of the model are obtained.In this dissertation,the time limit for synthesizing each program is set to within 5seconds,and experiments are conducted on the E1 and E2 datasets.The E1 dataset contains programs with only 4 statements,while the E2 dataset includes programs with lengths of5,8,10,12,and 14 statements.The experimental results demonstrate that compared to the baseline methods,the N-TSPS method achieves a success rate of 88.62% on the E1 dataset.On the E2 dataset,the success rate improves by 1.42% to 2.65%,alleviating the impact of inaccurate program specification descriptions caused by example ambiguity to some extent.In cases where the example’s specification description is incomplete,the MICI method improves the success rate by 2.63% to 4.63%,mitigating the interference caused by incomplete specification descriptions in accurately synthesizing programs.Furthermore,the RL-GIPS method significantly enhances the generalization capabilities of DeepCoder,PCCoder,N-PEPS,and MICI on the E1 and E2 datasets,confirming the effectiveness of this approach in improving the generalization ability of IPS models.
Keywords/Search Tags:Artificial Intelligence, Software Engineering, Program Synthesis, Input-Output Examples, Functional Programming
PDF Full Text Request
Related items