
Parallelization Research Of Quantum Fourier Transform Algorithm For Domestic DCU

Posted on: 2023-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: K Ma
Full Text: PDF
GTID: 2530306620987379
Subject: Software engineering
Abstract/Summary:
The "Songshan" supercomputer system adopts a large-scale heterogeneous architecture that pairs domestic CPU processors with DCU accelerators, and its performance is on par with internationally advanced supercomputers. The Quantum Fourier Transform (QFT) is a core component of quantum algorithms such as Shor's algorithm and the quantum phase estimation algorithm. To verify that quantum computing algorithms can feasibly be simulated on this platform, and to promote the research and optimization of quantum algorithms for the domestic "Haiguang-1" DCU accelerator, this thesis implements and optimizes a QFT simulation algorithm on the "Songshan" supercomputer platform. The main work is as follows:

(1) The quantum circuit model is chosen as the simulation model, and the libquantum library is used as the basis for the work. First, a hot spot analysis is carried out on the serial CPU implementation of the QFT simulation program. The two quantum logic gates in the QFT circuit, the phase-shift R gate and the Hadamard gate, account for 90% of the program's total running time. The operation of the phase-shift R gate on the probability amplitudes is computationally intensive, which makes it well suited to parallel computing. Then, following the HIP heterogeneous programming model, device functions are written for the phase-shift R gate and Hadamard gate operations, and the computational hot spots of the QFT simulation are mapped to the DCU. Furthermore, the operation of the R gate on the probability amplitudes is analyzed, and an optimization scheme is proposed that increases the proportion of active DCU threads. This further improves the efficiency of the phase-shift R gate on the DCU and lays the foundation for extending the QFT simulation to multiple DCU accelerators.

(2) Because the memory of a single DCU limits the number of qubits that can be simulated, this thesis studies heterogeneous QFT simulation on multiple DCUs. The simulation is first extended to two DCUs. HIP streams are used to transfer probability amplitude data concurrently between the host and the two DCU accelerators and to run the gate operations concurrently on the DCUs. By overlapping the quantum logic gate operations on the DCU with host-to-device data communication, the communication latency of the probability amplitude transfers is reduced, and the utilization of both the host and the DCU accelerators is improved.

(3) When four DCUs are used to simulate the QFT at larger qubit counts, the volume of quantum state data grows further and the stream-based approach occupies an excessive amount of host memory, which reduces the overall efficiency of the QFT simulation. Based on an analysis of the algorithm and the platform architecture, this thesis uses a hybrid MPI + HIP-C programming model to run data transfers and quantum logic gate operations concurrently, which further improves the efficiency of the QFT simulation on four DCU accelerators. Analysis shows that when four cards run in parallel, dividing the probability amplitudes evenly leads to an uneven distribution of computing tasks across the DCUs. This thesis therefore proposes a task division scheme based on the memory of the individual DCU accelerators, so that the computing tasks on each DCU accelerator are as equal as possible. Finally, accesses to the global memory of the DCU accelerator are optimized, and coalesced (merged) data reads are used to improve memory access speed on the DCU accelerators.

The optimization schemes in this thesis have been verified experimentally on the "Songshan" supercomputer platform, where a 28-qubit Quantum Fourier Transform simulation has been realized on a single computing node. Compared with the serial CPU implementation, the QFT simulation that combines the data transfer optimizations with the DCU accelerator optimizations achieves a speedup of 20.275. The basic quantum gate optimization method and the memory optimization scheme for DCU accelerators presented here lay the foundation for application-level, multi-node quantum algorithm simulation on this platform, and provide a reference for the efficient simulation of other quantum algorithms on domestic heterogeneous platforms.
Keywords/Search Tags: Quantum Fourier Transform, quantum circuit, quantum logic gate, DCU accelerator, HIP heterogeneous programming model
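
The abstract describes the QFT simulation as Hadamard and phase-shift R gates acting on a vector of probability amplitudes. The following is a minimal serial sketch of that idea in Python, following the textbook QFT circuit rather than the thesis's HIP code, which is not reproduced on this page. The function names `hadamard`, `cond_phase`, and `qft` are illustrative assumptions; they are not claimed to match the libquantum API or the thesis's device functions.

```python
import cmath
import math


def hadamard(state, target):
    """Apply a Hadamard gate to qubit `target` of the state vector, in place."""
    mask = 1 << target
    inv_sqrt2 = 1.0 / math.sqrt(2.0)
    for a in range(len(state)):
        if (a & mask) == 0:
            b = a | mask  # partner index: same bits, target bit set
            lo, hi = state[a], state[b]
            state[a] = (lo + hi) * inv_sqrt2
            state[b] = (lo - hi) * inv_sqrt2


def cond_phase(state, control, target):
    """Controlled phase-shift (R) gate, in place.

    Multiplies by exp(i*pi / 2**|control - target|) every amplitude whose
    index has both the control and the target bit set.
    """
    mask = (1 << control) | (1 << target)
    factor = cmath.exp(1j * math.pi / (1 << abs(control - target)))
    for idx in range(len(state)):
        if (idx & mask) == mask:
            state[idx] *= factor


def qft(state, width):
    """Textbook QFT on `width` qubits; returns a new state vector."""
    state = list(state)
    for i in range(width - 1, -1, -1):
        hadamard(state, i)
        for j in range(i - 1, -1, -1):
            cond_phase(state, j, i)
    # The circuit above leaves the qubits in reversed order; undo that here.
    out = [0j] * len(state)
    for idx, amp in enumerate(state):
        rev = int(format(idx, f"0{width}b")[::-1], 2)
        out[rev] = amp
    return out


if __name__ == "__main__":
    width = 3
    basis = [0j] * (1 << width)
    basis[1] = 1.0 + 0j  # start in basis state |001>
    for k, amp in enumerate(qft(basis, width)):
        print(k, amp)
```

Every iteration of the loops in `hadamard` and `cond_phase` touches a disjoint set of amplitudes, which is what makes these gates natural candidates for a one-thread-per-index mapping on an accelerator. Note that `cond_phase` modifies only the indices with both bits set, one quarter of the vector; whether this is the inefficiency that the thesis's thread-activity optimization targets is an assumption, not something the abstract states.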