Design And Implementation Of Parallel SM4-GCM Based On CUDA

Posted on: 2020-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: C X Zhang
Full Text: PDF
GTID: 2428330602951856
Subject: Engineering
Abstract/Summary:
In the era of big data and 5G communication, the safe and reliable transmission of information in high-speed network communication systems has become a hot research topic. One important question is how to ensure the security, authenticity, integrity and non-repudiation of data transmitted over the internet. SM4 is a block cipher widely used in industry to protect data, GCM provides data authentication, and SM4-GCM combines the two to provide both encryption and authentication. In recent years parallel computing technology has developed rapidly, and the GPU, with its powerful parallel computing capability, has become the preferred acceleration module in high-speed heterogeneous computing systems. The main goal of this thesis is to use the CPU-GPU heterogeneous computing model to achieve high-speed authenticated encryption of data. The following work was carried out toward this goal:

1. The authenticated encryption schemes commonly used in industry are surveyed and their advantages and disadvantages are analyzed. The CUDA programming model, memory model and memory access characteristics, the CUDA execution model, and the basic principles of the SM4-GCM algorithm are introduced.

2. Based on the principles of SM4-GCM, the algorithm is analyzed for parallelization. It is divided into three main parts, and its serial and parallel tasks are separated.

3. To achieve efficient data reads and writes, this thesis introduces hierarchical storage, using shared memory as a cache between global memory and registers. Two data storage layouts are designed, based on the access characteristics of global memory and shared memory. The design keeps global memory accesses coalesced and also avoids shared memory bank conflicts during data caching. To convert between the two layouts, four address offset lookup tables are designed, so that each thread can quickly determine its read and write addresses by table lookup. The conversion between the two layouts is thereby carried out without bank conflicts. The same idea is applied in the subsequent encryption and authentication modules.

4. In designing the cryptographic kernel function, the SM4 round function is optimized to reduce the kernel's register consumption, and loop unrolling is used to reduce redundant instructions. The GCM authentication mode is improved on the basis of cryptographic theory. In designing the host interface function, pinned memory and streams are introduced to hide the communication latency between the CPU and the GPU.

5. Taking the GPU's parameters into account, the modules designed in this thesis are tested and the launch configuration of the relevant kernel functions is tuned to obtain the optimal kernel configuration. The kernel's performance indicators are then measured under that configuration, and the results show that they meet expectations. Finally, the influence of the different optimization measures on module performance is compared and the results are analyzed.

Based on the techniques and methods studied, this thesis improves the parallelization of the SM4-GCM authenticated encryption algorithm. The authenticated encryption speed reaches 1.62 GB/s, which satisfies the requirement of current 5G communication technology for authenticated encryption speed. The program is portable and has broad application prospects.
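
The abstract does not say how the GCM authentication step was restructured for the GPU. As an illustration only, the sketch below shows one standard way the serial GHASH chain can be rewritten as independent per-block terms, which is the kind of split between serial and parallel work described in item 2. It is written in Python rather than CUDA so that it runs anywhere. The function names (`gf_mult`, `ghash_serial`, `ghash_parallel_form`) are hypothetical and are not taken from the thesis.

```python
from functools import reduce

# Reduction constant for x^128 + x^7 + x^2 + x + 1 in GCM's bit order
# (the leftmost bit of a 128-bit block is the coefficient of x^0).
R = 0xE1 << 120


def gf_mult(x: int, y: int) -> int:
    """Multiply two 128-bit blocks in GF(2^128) as defined for GCM."""
    z = 0
    v = y
    for i in range(128):
        # Scan the bits of x from the leftmost (most significant) bit.
        if (x >> (127 - i)) & 1:
            z ^= v
        # Multiply v by x: shift right, reduce if a bit falls off the end.
        if v & 1:
            v = (v >> 1) ^ R
        else:
            v >>= 1
    return z


def ghash_serial(h: int, blocks: list[int]) -> int:
    """Textbook chain: Y_i = (Y_{i-1} XOR X_i) * H, one block after another."""
    y = 0
    for x in blocks:
        y = gf_mult(y ^ x, h)
    return y


def ghash_parallel_form(h: int, blocks: list[int]) -> int:
    """Same value written as XOR of X_i * H^(m-i) over all blocks.

    Each term depends only on its own block and a precomputed power of H,
    so on a GPU each term could be assigned to its own thread (assumption:
    this mapping is illustrative, not the thesis's actual kernel design).
    """
    m = len(blocks)
    powers = [h]  # powers[k] holds H^(k+1)
    for _ in range(m - 1):
        powers.append(gf_mult(powers[-1], h))
    terms = [gf_mult(x, powers[m - 1 - i]) for i, x in enumerate(blocks)]
    return reduce(lambda a, b: a ^ b, terms, 0)


if __name__ == "__main__":
    h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E  # arbitrary example hash subkey
    data = [0x0123456789ABCDEF0123456789ABCDEF, 0xFEDCBA9876543210FEDCBA9876543210]
    print(ghash_serial(h, data) == ghash_parallel_form(h, data))
```

The two functions agree because multiplication in GF(2^128) distributes over XOR, so the nested chain expands into a sum of block-times-power terms. In this form the remaining serial work is the precomputation of the powers of H and the final XOR reduction.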
Keywords/Search Tags: high-speed network, authenticated encryption, CUDA, parallel computing