Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI

Geng Yuan1, Zhiheng Liao2, Xiaolong Ma1, Yuxuan Cai1, Zhenglun Kong1, Xuan Shen1, Jingyan Fu2, Zhengang Li1, Chengming Zhang3, Hongwu Peng4, Ning Liu1, Ao Ren5, Jinhui Wang6, Yanzhi Wang1
1Northeastern University, 2North Dakota State University, 3Washington State University, 4University of Connecticut, 5Chongqing University, 6University of South Alabama


Abstract

Recent research has demonstrated the promise of resistive random access memory (ReRAM) as an emerging technology for performing inherently parallel, in-situ matrix-vector multiplication in the analog domain—the key and most computation-intensive operation in deep neural networks (DNNs). However, hardware failures, such as stuck-at-fault defects, are among the main concerns that prevent ReRAM devices from being a feasible solution for real-world implementations. Existing solutions to this issue usually require a dedicated optimization to be conducted for each individual device, which is impractical for mass-produced products (e.g., IoT devices). In this paper, we rethink the value of weight pruning in ReRAM-based DNN design from the perspective of model fault tolerance, and we propose a differential mapping scheme to improve fault tolerance under high stuck-on fault rates. Our method can tolerate a failure rate almost an order of magnitude higher than the traditional two-column method on representative DNN tasks. More importantly, our method incurs no extra hardware cost compared to the traditional two-column mapping scheme. The improvement is universal and does not require a per-device optimization process.
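For readers unfamiliar with the two-column mapping baseline referenced above, the following is a minimal NumPy sketch of the conventional scheme, in which each signed weight is represented by the difference of two crossbar conductances (it illustrates the baseline only, not the proposed differential mapping; the conductance range `g_min`/`g_max` and function names are illustrative assumptions, not values from the paper).

```python
import numpy as np

def two_column_map(W, g_min=1e-6, g_max=1e-4):
    """Map a signed weight matrix onto two conductance matrices
    (positive and negative columns) so that W is proportional to G_pos - G_neg.
    g_min/g_max are assumed, illustrative conductance bounds."""
    scale = (g_max - g_min) / np.abs(W).max()
    G_pos = g_min + scale * np.clip(W, 0, None)    # positive weights go to the "+" column
    G_neg = g_min + scale * np.clip(-W, 0, None)   # magnitudes of negative weights go to the "-" column
    return G_pos, G_neg, scale

def crossbar_mvm(v, G_pos, G_neg, scale):
    """Idealized analog matrix-vector product: subtracting the two column
    currents recovers v @ W up to the mapping scale (the g_min offset cancels)."""
    return (v @ G_pos - v @ G_neg) / scale

# Quick consistency check against the digital result
W = np.random.randn(4, 3)
v = np.random.randn(4)
G_pos, G_neg, s = two_column_map(W)
assert np.allclose(crossbar_mvm(v, G_pos, G_neg, s), v @ W)
```

A stuck-on defect in this scheme pins a cell to its low-resistance (high-conductance) state regardless of the programmed weight, which is why high stuck-on fault rates degrade accuracy and motivate the fault-tolerant mapping studied in this paper.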