As artificial intelligence (AI) enters all aspects of our lives, in applications such as face recognition, autonomous driving, machine translation, voiceprint recognition, and intelligent customer-service robots, the amount of data generated is growing explosively. The arithmetic logic unit (ALU) that processes these data must therefore improve greatly to meet the requirements of the “computing power era”. At present, however, most computing systems are based on the von Neumann architecture, in which the ALU and the memory are physically separated. The power consumption and latency of moving data back and forth between the ALU and the memory are much larger than those of the computation itself. Computing-in-memory (CIM) is a promising new computing method aimed at solving the problems caused by the von Neumann bottleneck. It eliminates the large amount of data transmitted between the processing and memory units, thereby significantly decreasing latency and energy consumption. However, CIM must also address how the calculation results are written back: if only parallel computing is implemented, the restoring process becomes a new bottleneck. In this paper, a bidirectional self-cycling SRAM array structure is proposed. The core of the SRAM is a self-cycling 8-transistor (8T) cell, reconstructed from the basic 6-transistor (6T) cell. Two transistors are inserted between the two cross-coupled inverters to realize basic read-write operations, CIM operations, and row-wise copy operations. In basic read-write mode, a single-ended write operation is realized by cutting off the transistor between the two inverters in the cell. Owing to the symmetry of the proposed 8T cell and the two-direction peripheral circuits, reading and writing are equivalent in the row and column directions. In CIM mode, full-array Boolean logic operations are achieved in both directions by multiplexing the power and ground terminals of the cross-coupled inverters in the storage array. This requires no extra memory, and the results are restored to the in-situ bit cells in a single cycle. In addition, in the proposed bidirectional SRAM, any data row can be copied into another row by controlling the intermediate transistor of the 8T cell. To verify the effectiveness of the proposed CIM system, a 16 Kb SRAM was implemented in 28 nm CMOS technology. The experimental results show that the read margin of the proposed 8T SRAM cell is 3.15 times higher than that of the conventional 6T cell, and that the cell is robust when performing the various operations. When performing the AES algorithm, the energy efficiency of the proposed circuit is 4 times that of the von Neumann architecture. At a supply voltage of 0.66 V, the logic operation power consumption of the chip is 3.7 fJ/bit. At a supply voltage of 0.75 V, the operating frequency of the chip reaches 113 MHz. With this chip, the system energy efficiency is improved by 2.83–8.01 times in comparison with existing CIM macros, and the energy efficiency reaches 270.5 TOPS/W.
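The three operating modes described above can be illustrated with a purely behavioral sketch. It models only the logical effect of each operation on the stored bits, not the 8T circuit, its timing, or its energy. The class name `BidirectionalArray`, its method names, the choice of AND/OR/XOR as the supported operations, and the convention that the result overwrites the second operand are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical set of Boolean operations; the paper does not list them here.
OPS: Dict[str, Callable[[int, int], int]] = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}


class BidirectionalArray:
    """Behavioral model of a bit array accessible by row and by column."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows = rows
        self.cols = cols
        self.cells: List[List[int]] = [[0] * cols for _ in range(rows)]

    # Basic read-write mode: equivalent in the row and column directions.
    def write_row(self, r: int, bits: List[int]) -> None:
        assert len(bits) == self.cols
        self.cells[r] = list(bits)

    def read_row(self, r: int) -> List[int]:
        return list(self.cells[r])

    def write_col(self, c: int, bits: List[int]) -> None:
        assert len(bits) == self.rows
        for r, bit in enumerate(bits):
            self.cells[r][c] = bit

    def read_col(self, c: int) -> List[int]:
        return [row[c] for row in self.cells]

    # CIM mode: the result is stored in place (assumed: in the dst operand),
    # so no extra memory is modeled.
    def logic_rows(self, op: str, src: int, dst: int) -> None:
        f = OPS[op]
        self.cells[dst] = [f(a, b) for a, b in zip(self.cells[src], self.cells[dst])]

    def logic_cols(self, op: str, src: int, dst: int) -> None:
        f = OPS[op]
        for row in self.cells:
            row[dst] = f(row[src], row[dst])

    # Row-wise copy: any row can be copied into another row.
    def copy_row(self, src: int, dst: int) -> None:
        self.cells[dst] = list(self.cells[src])


if __name__ == "__main__":
    arr = BidirectionalArray(4, 4)
    arr.write_row(0, [1, 1, 0, 0])
    arr.write_row(1, [1, 0, 1, 0])
    arr.logic_rows("xor", 0, 1)   # row 1 now holds row0 XOR row1
    arr.copy_row(1, 3)            # duplicate the result into row 3
    print(arr.read_row(3))
```

In this model the in-place result and the row copy are ordinary list updates; in the proposed hardware they correspond to the single-cycle in-situ restore and to the copy controlled by the intermediate transistor of the 8T cell.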