SRAM Voltage Scaling for Energy-Efficient Convolutional Neural Networks

Lita Yang and Boris Murmann
Stanford University


Abstract

Motivation: Convolutional neural networks (ConvNets) are achieving state-of-the-art results on a wide range of classification tasks. Problem: However, deploying these networks in power-limited IoE and mobile devices remains a challenge due to their high memory power consumption. Approach: To tackle this problem, we propose to exploit the error resilience of ConvNets and accept bit errors under reduced SRAM supply voltages, thereby lowering the power consumption of hardware ConvNet implementations. While the effects of quantization errors in ConvNet computation have been studied extensively, there is limited literature on how these networks perform in the presence of bit errors from voltage-scaled SRAM. Objective: Motivated by the need to reduce memory power consumption and to better understand the resilience of ConvNets to bit errors, we present the first silicon-validated study exploring the limits of SRAM voltage scaling in ConvNets.

The main contributions of our work are:

1) We demonstrate that SRAM supply voltage scaling in ConvNets can reduce the memory supply voltage by 280 mV, yielding a 4.8x reduction in leakage power and a 2.6x reduction in memory access power.

2) Training with low-voltage SRAM, i.e., injecting bit errors during training, enables a further reduction of 310 mV from the nominal supply, yielding a 5.4x reduction in leakage power and a 2.9x reduction in memory access power (see the sketch after this list).

3) Finally, we present a case study demonstrating that sweeping training and testing voltages to match the error distributions between the training and test phases can bring classification error rates closer to floating-point performance.
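To make the bit-error injection of contribution 2 concrete, the following is a minimal sketch assuming a uniform random bit-flip model over 8-bit fixed-point weight codes at a given bit-error rate (BER). The function name, the uint8 storage format, and the independent-flip error model are illustrative assumptions for exposition, not the paper's exact implementation or the measured SRAM failure statistics.

```python
import numpy as np

def inject_bit_errors(weights_q, ber, n_bits=8, rng=None):
    """Flip each stored bit of `weights_q` independently with probability `ber`.

    weights_q : array of quantized weights stored as unsigned n_bits codes
    ber       : assumed bit-error rate of the voltage-scaled SRAM (e.g. 1e-4)
    """
    rng = np.random.default_rng() if rng is None else rng
    w = weights_q.astype(np.uint8).reshape(-1)
    # Draw an independent flip decision for every bit of every stored word.
    flips = rng.random((w.size, n_bits)) < ber
    # Pack the per-bit flip decisions into one XOR mask per word.
    mask = np.zeros(w.size, dtype=np.uint8)
    for b in range(n_bits):
        mask |= flips[:, b].astype(np.uint8) << b
    return (w ^ mask).reshape(weights_q.shape)

# Example: corrupt 8-bit weights at an assumed BER of 1e-3, as one would
# do to the stored weights on every forward pass during error-aware training.
w_q = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
w_noisy = inject_bit_errors(w_q, ber=1e-3)
```

In an error-aware training loop, such corruption would be applied to the quantized weights before each forward pass so that the network learns to tolerate the flip statistics of the low-voltage memory; matching the injected BER to the deployment voltage is the idea behind the train/test sweep in contribution 3.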