An Efficient Processing In Memory Framework For Convolutional Neural Networks Using The Second Generation Racetrack Memory

Posted on:2019-09-13

Degree:Master

Type:Thesis

Country:China

Candidate:B C Liu

Full Text:PDF

GTID:2428330566460758

Subject:Software engineering

Abstract/Summary:

As a new computing paradigm, Processing In Memory (PIM) allows computation to proceed in parallel in both processors and memories, which drastically reduces data movement between computation units and storage units. PIM can therefore be regarded as an efficient technology for partly addressing the shortcomings of the Von Neumann architecture. Compared with traditional random access memory, racetrack memory has many merits, including high density, non-volatility, and low static power, so it is well suited to efficient PIM computing.

To address the shortcomings of convolutional neural networks (CNNs), this paper proposes a novel PIM framework based on the skyrmion material. In this framework, we use skyrmion-based racetrack memories to construct the storage units, and skyrmion-based logic gates to compose both the adders and the multipliers of the computation units. Because the framework does not need CMOS circuits to assist in constructing the underlying computation units, design complexity is significantly reduced. Meanwhile, our proposed optimizations of read and write operations at the circuit level, together with the address mapping mode of the memory at the system level, drastically improve the performance of the framework. To solve the problem of how the different hierarchical functions of CNNs can be realized in the skyrmion-based PIM architecture, this paper proposes several optimization design methods. These methods effectively support the operation of CNNs under the new PIM framework while fully utilizing the advantages of skyrmion-based racetrack memory.

The main contributions of this paper are as follows:

1. The second-generation racetrack memory not only provides storage but is also naturally suited to implementing computation. This paper makes full use of this feature to design skyrmion-based full adders and multipliers that perform specific calculations in memory. A computing unit composed of such adders and multipliers does not need a large number of CMOS circuits to assist it, which greatly reduces both system implementation complexity and system power consumption.

2. The second-generation racetrack memory, as a new type of non-volatile storage, is fundamentally different from traditional random access memory (RAM) in its physical structure: in addition to the traditional read and write operations, it also has a shift operation. It therefore cannot serve as a drop-in replacement for existing random access memory. In response, this paper redesigns and optimizes non-volatile memory cells based on skyrmion racetrack memory. We also propose a memory address mapping method designed specifically for this new storage structure, which greatly reduces the total number of shift operations and ultimately improves the operational efficiency of the overall in-memory computing framework.

3. This paper designs and implements a method that enables a general convolutional neural network to run efficiently and correctly in the skyrmion-based PIM framework. The method layers the CNN by function and processes the functions of each level separately. Most of the simpler operations, such as matrix multiplication and averaging, are decomposed into additions and multiplications, which can be performed directly in the proposed PIM architecture. For the more complex operations that cannot be decomposed in this way, such as derivation, this paper presents two ideas: one is to transfer the data to the general-purpose processor (CPU) for processing, and the other is to use a look-up table to obtain an approximate value.

Experimental results show that, compared with the state-of-the-art domain-wall-based PIM framework, our approach achieves a 52.1% time improvement and 40% energy savings on average.
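The decomposition idea in the abstract (building adders and multipliers from logic gates, reducing matrix-style operations to additions and multiplications, and falling back to a look-up table for operations that cannot be decomposed) can be illustrated with a small software model. This is only a sketch under stated assumptions and not the thesis's circuit design: the function names, the 16-bit word width, the shift-and-add multiplier, and the choice of the sigmoid as the table-approximated function are all hypothetical and do not come from the source.

```python
import math

def full_adder(a, b, cin):
    """One-bit full adder expressed only with AND/OR/XOR gate operations."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def add(x, y, width=16):
    """Ripple-carry addition of unsigned integers, one full adder per bit.
    The word width is an assumption; the final carry is dropped, so the
    result wraps modulo 2**width."""
    result, carry = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

def multiply(x, y, width=16):
    """Shift-and-add multiplication built only on add() (assumed scheme)."""
    mask = (1 << width) - 1
    product = 0
    for i in range(width):
        if (y >> i) & 1:
            product = add(product, (x << i) & mask, width)
    return product

def dot(xs, ws, width=16):
    """Dot product decomposed into the two primitives above; a matrix
    multiplication is a collection of such dot products."""
    acc = 0
    for x, w in zip(xs, ws):
        acc = add(acc, multiply(x, w, width), width)
    return acc

def build_sigmoid_table(lo=-8.0, hi=8.0, entries=256):
    """Precompute a look-up table for a nonlinear function. The sigmoid,
    the input range, and the table size are illustrative assumptions."""
    step = (hi - lo) / (entries - 1)
    return [1.0 / (1.0 + math.exp(-(lo + i * step))) for i in range(entries)]

def sigmoid_lut(x, table, lo=-8.0, hi=8.0):
    """Approximate the function by clamping x and reading the nearest entry."""
    x = min(max(x, lo), hi)
    index = round((x - lo) / (hi - lo) * (len(table) - 1))
    return table[index]

if __name__ == "__main__":
    table = build_sigmoid_table()
    print(dot([1, 2, 3], [4, 5, 6]))
    print(sigmoid_lut(0.0, table))
```

In this model the table lookup trades accuracy for the absence of any arithmetic beyond indexing, which mirrors the second of the two ideas the abstract gives for complex operations; the first idea (sending the data to the CPU) would simply bypass these functions.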

Keywords/Search Tags:

racetrack memory, nonvolatile, processing in memory, address mapping, convolutional neural network

Related items

1	Research On Memory Mapping Methods Of Reconfigurable And SIMT Processor System Architectures
2	Research On NVM Based Main Memory Key Technology
3	Optimal And Design Of Convolutional Neural Network Based On Processing In Memory
4	Development Of 8Kb Nc-Si Nonvolatile Memory Prototype Device With NOR Memory Function And Investigation Of Device Physics
5	Rethinking the memory hierarchy design with nonvolatile memory technologies
6	Research On Nonvolatile Average 7T1R Static Random Access Memory Based On RRAM
7	Research On Novel High-Density Nonvolatile Memories
8	Embedded FLASH Based Processing-in-memory Architecture For Convolutional Neural Network
9	Design Of Storage-compute Fusion Circuit For Spin Magnetic Memory For Convolutional Neural Network
10	Research On Processing-in-memory Architecture For Neural Network Computation In ReRAM-based Main Memory