Research On Source Code Automatic Migration Technology Based On Deep Learning

Posted on: 2024-04-08  Degree: Master  Type: Thesis
Country: China  Candidate: M R Xu  Full Text: PDF
GTID: 2568307091465344  Subject: Computer Science and Technology
Abstract/Summary:
To adapt legacy software systems to new hardware and software environments, developers must rewrite software projects in new programming languages, which is time-consuming and error-prone. Researchers have therefore proposed automatic source code migration techniques, which reduce the burden on developers by converting source code from one programming language to another. Current studies typically migrate source code with neural machine translation models, but they improve the accuracy of the migrated code only from the perspective of abstract syntax trees, data flow, or models, without considering the semantic consistency and syntactic accuracy of the migrated code. To solve this problem, this paper designs source code migration models from three perspectives (code structure features, model architecture, and dataset) to improve the accuracy of the migrated code.

First, to address the semantic consistency of migrated code, this paper proposes HPGN (Hierarchical Pointer-Generator Networks), a source code migration model based on a token attention mechanism and a statement attention mechanism. HPGN can align source code statements and tokens when decoding target code statements, which improves model performance. The experimental results show that HPGN improves the overall score by 3.4 over the best comparison model while having fewer model parameters.

Then, this paper extends the statement attention mechanism to the Transformer model and proposes CSMAT (Code-Statement Masked Attention Transformer), a source code migration model based on a code-statement masked attention mechanism. The experimental results show that CSMAT outperforms the comparison model on all metrics, and that the code-statement masked attention mechanism also improves the performance of pre-trained models.

Finally, to improve the accuracy of the code migrated by CSMAT, this paper proposes VANOT (VAriable Name Obfuscation for Transformer), a model based on CSMAT. VANOT is trained on target code whose variable names have been obfuscated, which enables it to generate target code that is independent of variable names, based on the data flow features of the source code. The experimental results show that VANOT outperforms CSMAT on all metrics, and that pre-trained models based on the VANOT method can further improve the accuracy of source code migration.
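The abstract does not say how VANOT obfuscates variable names, so the following Python sketch is only an illustration of the general idea, not the thesis's implementation. It assumes the set of variable names is already known (for example from a parser) and replaces each one with a placeholder of the form `VAR_k`, numbered by first appearance. The function name `obfuscate_variables` and the `VAR_k` scheme are hypothetical.

```python
import re
from typing import Dict, Set, Tuple

# Matches whole identifiers only; \b keeps "i" from matching inside "int".
IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def obfuscate_variables(code: str, variable_names: Set[str]) -> Tuple[str, Dict[str, str]]:
    """Replace each known variable name with VAR_0, VAR_1, ... (hypothetical scheme).

    Placeholders are numbered by order of first appearance in `code`.
    Caveat: this is a plain regex pass, so names inside string literals
    or comments would be renamed too; a real pipeline would use a parser.
    """
    mapping: Dict[str, str] = {}

    def rename(match: "re.Match[str]") -> str:
        name = match.group(0)
        if name not in variable_names:
            return name  # keywords, types, and function names stay as they are
        if name not in mapping:
            mapping[name] = f"VAR_{len(mapping)}"
        return mapping[name]

    return IDENTIFIER.sub(rename, code), mapping


if __name__ == "__main__":
    target = "int total = 0; for (int i = 0; i < n; i++) { total += i; }"
    obfuscated, names = obfuscate_variables(target, {"total", "i", "n"})
    print(obfuscated)
    # int VAR_0 = 0; for (int VAR_1 = 0; VAR_1 < VAR_2; VAR_1++) { VAR_0 += VAR_1; }
    print(names)
```

Training on targets normalized this way would mean the model is never rewarded for reproducing a particular identifier spelling, which is one plausible reading of "variable name independent target code".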
Keywords/Search Tags: deep learning, code migration, machine translation, attention mechanism