| Inverse design of micro/nano photonic devices is a design method that derives device parameters and structure from a target function by combining a global optimization algorithm with electromagnetic field solution. An inverse design run optimizes the device parameters over thousands of cycles, and each cycle must compute the electromagnetic field distribution. The electromagnetic field space is discretized with the parallel finite-difference frequency-domain (FDFD) method, and a GPU runs an iterative algorithm to solve for the field distribution. Each field solve requires tens of thousands of GPU computing tasks executed serially, so an inverse design takes dozens or even hundreds of hours. With the rapid development of cloud computing and high-performance computing (HPC), target devices are becoming larger in computational scale and finer in structure. How to use the computing power of a simulation platform efficiently to accelerate the electromagnetic field solution has become a research hotspot in recent years. This paper focuses on acceleration techniques for the FDFD electromagnetic field solution, and its main contents are as follows: (1) By analyzing the effects of device size, device type, number of parallel tasks and graphics card specification on solution performance, the factors that directly determine the time cost of the FDFD electromagnetic field solution are studied. The program characteristics, call structure and performance bottlenecks of the FDFD solver are analyzed under sufficient computing power. The communication speed between multiple GPUs in the inverse design simulation platform is studied, and a strategy of giving priority to two GPUs on the same chipset for parallel computing is determined. (2) The paper extends the CPU electromagnetic field solver (FD3D solver) to a GPU solver, whose solving speed for small-scale devices is 1.3 times that of the FDFD solver. FD3D's multi-GPU parallel computing mode splits the original data across the computing nodes, which removes the limit of single-GPU computing power and gives better scalability and speedup than the FDFD solver. The average elapsed time for iteratively solving a small-scale device with 1, 2 and 4 GPUs is 10.1 seconds, 5.82 seconds and 4.45 seconds, respectively. The average elapsed time for iteratively solving a large-scale device is 392 seconds, 198 seconds and 112 seconds, respectively. In multi-GPU parallel computing, the FD3D solver outperforms the FDFD solver. (3) For the case of insufficient computing power, the single-GPU workflow of the FDFD solver is analyzed by measuring the CPU-GPU communication time and the GPU computing performance. Vector summation and the data interaction strategy are optimized, and synchronization operations and the calculation process are simplified, which reduces the electromagnetic field solving time. The results show that the average time of a single iteration step is reduced by 25.57% when solving small-scale devices, and the overall simulation saves 95 minutes. When solving large-scale devices, the average time of a single iteration step is reduced by 13.24%, and the overall simulation saves 425 minutes. These results make large-scale, fine-structured and integrated micro/nano photonic devices more feasible to develop. |
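
The multi-GPU timings quoted for the FD3D solver can be turned into speedup and parallel efficiency figures with a few lines of arithmetic. The sketch below is illustrative only: the names `GPU_COUNTS`, `ELAPSED_S` and `scaling_table` are hypothetical and not part of either solver, and efficiency is assumed here to mean speedup divided by the number of GPUs.

```python
# Average elapsed times (seconds) quoted in the abstract for the FD3D solver
# running on 1, 2 and 4 GPUs. All names in this sketch are illustrative.
GPU_COUNTS = [1, 2, 4]
ELAPSED_S = {
    "small": [10.1, 5.82, 4.45],
    "large": [392.0, 198.0, 112.0],
}


def scaling_table(gpu_counts, times):
    """Return (gpus, speedup, efficiency) tuples relative to the first entry."""
    base_time = times[0]
    base_gpus = gpu_counts[0]
    rows = []
    for n, t in zip(gpu_counts, times):
        speedup = base_time / t                  # how many times faster than the baseline
        efficiency = speedup / (n / base_gpus)   # fraction of ideal linear scaling
        rows.append((n, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for name, times in ELAPSED_S.items():
        for n, s, e in scaling_table(GPU_COUNTS, times):
            print(f"{name:5s} {n} GPU(s): speedup {s:.2f}x, efficiency {e:.1%}")
```

On the quoted figures, the large-scale device reaches a speedup of 3.5 on four GPUs (87.5% of linear scaling), while the small-scale device reaches about 2.27 (about 57%), which is consistent with the claim that splitting data across GPUs pays off most for large problems.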