Processing-In-Memory Acceleration of Convolutional Neural Networks for Energy-Efficiency, and Power-Intermittency Resilience

Arman Roohi¹, Shaahin Angizi², Deliang Fan³, Ronald F. DeMara³
¹ Computer Systems and Architecture Laboratory, Department of EECS, University of Central Florida
² Department of Electrical and Computer Engineering, University of Central Florida
³ University of Central Florida


Herein, a bit-wise Convolutional Neural Network (CNN) in-memory accelerator is implemented using Spin-Orbit Torque Magnetic Random Access Memory (SOT-MRAM) computational sub-arrays. It uses a novel AND-Accumulation method that significantly reduces energy consumption within convolutional layers, and it performs various low bit-width CNN inference operations entirely within MRAM. Resilience to power intermittency is also improved by retaining the partial state information needed to maintain forward progress in the computation, which is advantageous for battery-less IoT nodes. Simulation results indicate ~5.4× higher energy efficiency and 9× speedup over ReRAM-based acceleration, and ~9.7× higher energy efficiency and 13.5× speedup over recent CMOS-only approaches, while maintaining inference accuracy comparable to baseline designs.
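
To make the idea of AND-accumulation concrete, the following is a minimal software sketch of one common bit-wise formulation of a low bit-width dot product: each operand vector is split into bit planes, pairs of planes are combined with a bit-wise AND, the set bits are counted, and the counts are accumulated with a shift of the combined bit position. It is an illustration under stated assumptions (unsigned integer operands, hypothetical function names `bit_planes` and `and_accumulate_dot`), not a description of the authors' SOT-MRAM circuit or their exact method.

```python
def bit_planes(values, bits):
    """Split unsigned integers into bit planes.

    Plane m is a Python int whose bit i holds bit m of values[i].
    (Hypothetical helper for illustration only.)
    """
    planes = []
    for m in range(bits):
        plane = 0
        for i, v in enumerate(values):
            plane |= ((v >> m) & 1) << i
        planes.append(plane)
    return planes


def and_accumulate_dot(inputs, weights, in_bits, w_bits):
    """Dot product of two unsigned low bit-width vectors using only
    AND, bit counting, shift, and add.

    Assumes every input fits in `in_bits` bits and every weight fits
    in `w_bits` bits.
    """
    in_planes = bit_planes(inputs, in_bits)
    w_planes = bit_planes(weights, w_bits)
    total = 0
    for m, in_plane in enumerate(in_planes):
        for n, w_plane in enumerate(w_planes):
            # AND the two planes, count the ones, weight by 2^(m+n).
            ones = bin(in_plane & w_plane).count("1")
            total += ones << (m + n)
    return total


# Example: 2-bit inputs and 2-bit weights.
inputs = [3, 1, 2, 0]
weights = [1, 2, 3, 1]
result = and_accumulate_dot(inputs, weights, in_bits=2, w_bits=2)
reference = sum(x * w for x, w in zip(inputs, weights))
print(result, reference)  # both are 11
```

The nested loop runs `in_bits * w_bits` AND-and-count steps regardless of vector length, which is why this style of computation maps naturally onto wide, parallel bit-wise operations of the kind an in-memory sub-array can provide.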