Font Size: a A A

A Class Of In-Place Linear Transformations Possessing The Cache-Oblivious Property

Posted on:2021-01-28Degree:MasterType:Thesis
Country:ChinaCandidate:Z ZhaoFull Text:PDF
GTID:2428330602994307Subject:Cyberspace security
Abstract/Summary:PDF Full Text Request
Modern microprocessors provide a rich memory hierarchy,including various levels of cache and registers.Some of these memories(such as main memory and L3 cache)are relatively large,but they are slow and shared among all cores.Others(registers,L1 cache)are fast and only assigned to a small core.The cache size limits the performance of high-performance computing to some extent.In addition,due to the difference in access speed between the various layers of the hierarchical storage structure,the number of cache misses becomes another factor that affects the performance of high-performance computing.In this thesis,we consider the basic linear transformation algorithm,and we propose an in-place linear transformation algorithm with cache-oblivious properties.The in-place linear transformation algorithm allows the calculated output to be overlaid on the input,and only 0(1)extra storage space is used in the entire calculation process.We first introduce an in-place linear transformation algorithm based on block lower upper(LU)decomposition,and show that the proposed algorithm has cache-oblivious properties.At the same time,we combine the space-filling curve to map the transformation matrix and give a loop version of the algorithm.We analyze the number of cache misses in the progressive sense of the algorithm and give the relevant proof.We discussed the extended algorithm based on the proposed algorithm when the transformation matrix is a non-square matrix.and discussed the implementation method based on block lower diagonal upper(LDU)decomposition.We also give an in-place linear transformation algorithm based on permutation lower upper(PLU)decomposition suitable for arbitrary matrices.By using space-filling curves,the algorithm also has cache-oblivious properties.Finally,we conducted performance tests on the above algorithms through simulation experiments and compared them with existing algorithms.The simulation results show that the in-place linear transformation algorithm proposed in this thesis has a better cache hit ratio than the existing in-place linear transformation algorithm,and the average performance is improved by more than 30%.Compared with the out-of-place algorithm,the algorithm proposed in this thesis reduces the memory consumption by 50%.
Keywords/Search Tags:cache-oblivious, linear transformation, in-place algorithm, block LU decomposition, PLU decomposition
PDF Full Text Request
Related items