Dynamic Reconfiguration of CNNs for Input-Dependent Approximation

Maedeh Hemmat¹ and Azadeh Davoodi²
¹University of Wisconsin-Madison, ²University of Wisconsin, Madison

Abstract

In this work, we propose a novel framework that enables dynamic reconfiguration of an already-trained Convolutional Neural Network (CNN) in hardware during inference. The reconfiguration provides input-dependent approximation of the CNN to save power without much degradation in classification accuracy. For each input, our framework conducts inference using only a fraction of the CNN’s edge weights, chosen based on that input, with the remaining weights treated as 0. Power is saved because, for the majority of inputs, fewer weights are fetched from off-chip memory and fewer multiplications are performed. More specifically, we propose a clustering algorithm that groups similar weights in the CNN based on their importance, and an iterative framework that decides how many clusters of weights should be fetched from off-chip memory for each individual input. We also propose new hardware structures to implement our framework on top of a recently proposed FPGA-based CNN accelerator. In our experiments with popular CNNs, we show significant power saving with almost no degradation in classification accuracy, since inference uses only a fraction of the edge weights for the majority of the inputs.
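The cluster-then-fetch loop described above can be illustrated with a minimal software sketch. It is not the authors' algorithm or hardware design: the abstract does not state how importance is measured or when the iteration stops, so the sketch assumes weight magnitude as the importance measure and a softmax confidence threshold as the stopping rule, and it models a single fully connected layer rather than a CNN. The names `cluster_by_importance`, `adaptive_inference`, and `threshold` are hypothetical.

```python
import math


def cluster_by_importance(weights, num_clusters):
    """Group weight indices into clusters, most important cluster first.

    Assumption: importance is the weight magnitude |w|. The abstract does
    not specify the actual importance measure.
    """
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    size = math.ceil(len(order) / num_clusters)
    return [order[k:k + size] for k in range(0, len(order), size)]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_inference(weights, x, num_classes, clusters, threshold):
    """Classify x, fetching weight clusters one at a time.

    weights is a flat, row-major list of size num_classes * len(x).
    Weights that have not been fetched yet stay 0. Assumption: iteration
    stops once the top softmax probability reaches `threshold`.
    Returns (predicted label, number of clusters fetched).
    """
    n = len(x)
    active = [0.0] * len(weights)
    fetched = 0
    probs = []
    for cluster in clusters:
        for i in cluster:
            active[i] = weights[i]  # stands in for an off-chip memory fetch
        fetched += 1
        scores = [
            sum(active[c * n + j] * x[j] for j in range(n))
            for c in range(num_classes)
        ]
        probs = softmax(scores)
        if max(probs) >= threshold:
            break  # confident enough; skip the remaining clusters
    return probs.index(max(probs)), fetched


# Toy model: 2 classes, 3 input features, 3 clusters of 2 weights each.
W = [5.0, 0.1, 0.0,
     -0.2, 0.1, 4.0]
clusters = cluster_by_importance(W, num_clusters=3)

easy = adaptive_inference(W, [1.0, 0.0, 0.0], 2, clusters, threshold=0.9)
hard = adaptive_inference(W, [1.0, 0.0, 1.0], 2, clusters, threshold=0.9)
print(easy, hard)
```

In this toy setting the first input is classified after fetching a single cluster, while the second never reaches the threshold and falls back to all three clusters, which mirrors the input-dependent behavior the abstract describes.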