Research On Key Technology Of Cross-architecture Application Migration For Multicores And Manycores

Posted on: 2022-03-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M L Li
Full Text: PDF
GTID: 1488306731498154
Subject: Computer Science and Technology
Abstract/Summary:
To meet increasingly complex information-processing requirements, parallel architectures based on multicore and manycore processors have been widely studied and deployed. A variety of parallel architectures, each with different characteristics and advantages, are emerging to serve different functional objectives. These architectures increase processing power but also make program development more complex, and developing parallel programs is the primary way to exploit them. Compared with the rapid development of architectures, compilation and migration technology for new architectures lags behind. Research on cross-architecture application migration is therefore significant for upgrading legacy code and improving the software ecosystem. This thesis studies binary code parallelization for migrating applications without source code, and the migration of parallel applications, with or without source code, to the Sunway architecture. The main research contents and innovations are as follows:

1. A dynamic-static combined binary translation framework based on LLVM, called LLPEMU, is proposed. Dynamic binary translation is the key technology for migrating applications across instruction set architectures, but advanced optimization of the translated code introduces runtime overhead, which hurts overall performance. A dynamic-static combination mechanism is designed so that LLPEMU can optimize the code thoroughly without introducing runtime overhead. By combining superblock generation on the binary code with the LLVM intermediate representation, LLPEMU can apply a variety of code optimization methods and generate high-quality code. The experimental results show that the overall performance of LLPEMU is effectively improved compared with QEMU.

2. A sequential code parallelization method for binary translation is proposed and implemented on top of LLPEMU. The thesis applies automatic source code parallelization techniques to the binary translation system, translating serial programs into equivalent parallel programs that take advantage of multi-core resources. A binary code parallelization mechanism is designed, and a code optimization and refactoring method is proposed to overcome the obstacles to parallelizing binary code during binary translation. The experimental results show that the proposed parallelization method is feasible and that the performance of the generated code is effectively improved after parallelization.

3. A native code replacement method based on address space reuse is proposed. MPI parallel programs are migrated to the Sunway processor by replacing MPI library functions with native code. This research found that the existing method cannot recognize function calls made through jump instructions, which results in runtime errors. To solve this problem, the proposed method replaces library functions with native code by reusing the code address space of the program. The experimental results show that the proposed method successfully identifies function calls made through jump instructions and reduces the runtime overhead introduced by native code replacement.

4. To speed up the migration of OpenMP programs to the Sunway heterogeneous many-core architecture, an OpenMP program translation mechanism for that architecture is proposed and a runtime support library is implemented. OpenMP is a programming standard for shared-memory architectures, which differ from the Sunway heterogeneous many-core architecture. Following the semantics of OpenMP programs, the translation mechanism converts OpenMP programs into Sunway athread programs through source-to-source conversion. In the runtime support library, some thread control mechanisms of the OpenMP model are implemented on top of the mutex lock of the Sunway processor. That mutex lock suffers from memory congestion caused by intense lock contention. To solve this problem, a distributed lock mechanism with inter-core passing, called HDT-LOCK, is proposed. Using single-instruction multiple-data instructions and the register communication mechanism between cores, the lock is passed from core to core, which resolves the memory congestion and raises the throughput of the lock mechanism by a factor of up to 5.6.
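The lock-passing idea in contribution 4 can be illustrated with a classic MCS queue lock, sketched below in portable C11. This is not HDT-LOCK and not Sunway code: the abstract says HDT-LOCK passes the lock through SIMD instructions and inter-core register communication, whereas this sketch assumes ordinary shared memory, C11 atomics, and POSIX threads as stand-ins. All names (`mcs_lock`, `mcs_node`, `mcs_acquire`, `mcs_release`, `run_counter_demo`) are hypothetical. What it shows is the general principle: each waiter spins on its own flag, and the holder hands the lock directly to its successor, so waiters do not all hammer one shared lock word.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiting thread; each waiter spins only on its own flag. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;  /* successor in the wait queue */
    atomic_bool locked;               /* true while this thread must wait */
} mcs_node;

typedef struct {
    _Atomic(mcs_node *) tail;         /* last waiter, or NULL if the lock is free */
} mcs_lock;

static void mcs_acquire(mcs_lock *lock, mcs_node *me) {
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    /* Append ourselves to the queue with a single atomic exchange. */
    mcs_node *prev = atomic_exchange(&lock->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);       /* link behind the predecessor */
        while (atomic_load(&me->locked))     /* wait on our own flag only */
            sched_yield();
    }
}

static void mcs_release(mcs_lock *lock, mcs_node *me) {
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&lock->tail, &expected, NULL))
            return;
        /* A successor is in the middle of linking itself; wait for the link. */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    atomic_store(&succ->locked, false);      /* pass the lock to the successor */
}

typedef struct {
    mcs_lock *lock;
    long *counter;
    int iters;
} worker_arg;

static void *worker(void *p) {
    worker_arg *a = p;
    mcs_node node;                           /* per-thread queue node */
    for (int i = 0; i < a->iters; i++) {
        mcs_acquire(a->lock, &node);
        (*a->counter)++;                     /* critical section */
        mcs_release(a->lock, &node);
    }
    return NULL;
}

/* Hypothetical demo: nthreads threads (capped at 64) each add iters to a counter. */
long run_counter_demo(int nthreads, int iters) {
    mcs_lock lock;
    long counter = 0;
    pthread_t tid[64];
    worker_arg arg = { &lock, &counter, iters };

    atomic_init(&lock.tail, NULL);
    if (nthreads > 64)
        nthreads = 64;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, &arg);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Calling `run_counter_demo(4, 10000)` from a `main` function exercises the hand-off path under contention; if mutual exclusion holds, the returned counter equals the number of threads times the iterations per thread.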
Keywords/Search Tags: Binary translation, Dynamic-static combination, Sequential program parallelization, Reusing address space, Native code replacement, Distributed lock with passing