Preliminary Program

Multi-ALM: Run-time Multi-Level Reconfigurable Approximate Logarithmic Multiplier

Maliha Tasnim, Chinmay Raje, Sheldon Tan
University of California Riverside

Abstract

This paper presents a novel multi-level approximate logarithmic multiplier (ALM), named multi-ALM, that offers dynamic run-time re-configurability for various precision, performance, and power requirements. While previous research on ALM designs has focused primarily on trade-offs between performance, energy, and accuracy at design time, our work addresses the critical need for dynamic re-configurability at run-time to enable efficient power, performance, and quality management at the architecture and algorithm levels. The proposed multi-ALM is based on an innovative iterative formulation of logarithmic multiplication, resulting in a Taylor series-like formula. This formulation facilitates straightforward trade-offs between multiplication accuracy and power consumption. We demonstrate that traditional ALMs can be considered level 1 ALMs within the multi-ALM framework. Furthermore, we introduce a new four-level ALM MAC array architecture design that enables run-time reconfigurable MAC (Multiply-Accumulate) computing with customizable accuracy and performance settings. This architecture empowers system designers to adapt the ALM functionality on-the-fly, tailoring it to the specific requirements of different applications and scenarios. Numerical results show that the 8-bit two-level multi-ALM can achieve up to 17.22×, 2.78×, and 1.37× improvement in mean error, peak error, and power consumption, respectively, over the baseline ALM, while the area increases by 1.40×. The 16-bit two-level multi-ALM can achieve up to 17.5× and 2.75× improvement in mean error and peak error over the ALM. Furthermore, we evaluate the proposed multi-ALM design in several MAC applications and a discrete cosine transform (DCT) application. The results show that multi-ALM can effectively trade off performance, such as throughput per unit of resource consumption, for better accuracy of MAC computation and better quality of reconstructed images compared with conventional fixed-configuration approximate multipliers.
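
The abstract does not give the multi-ALM formula itself, so the following is only an illustrative sketch of how a multi-level logarithmic multiplier might behave, not the authors' design. It assumes a Mitchell-style decomposition in which each operand is written as its leading power of two plus a residue, so that (2^ka + ra)(2^kb + rb) = 2^(ka+kb) + ra·2^kb + rb·2^ka + ra·rb. Level 1 keeps the first three terms using shifts and adds only; each further level applies the same approximation to the dropped residue product ra·rb. The function names and the `levels` parameter are hypothetical.

```python
def leading_one(n: int) -> int:
    """Position of the most significant set bit of a positive integer."""
    return n.bit_length() - 1


def approx_log_mul(a: int, b: int, levels: int = 1) -> int:
    """Approximate a * b for non-negative integers using shifts and adds.

    Hypothetical sketch: each level adds one correction term by
    approximating the residue product left over from the previous level.
    """
    result = 0
    for _ in range(levels):
        if a == 0 or b == 0:
            break  # no residue left, the result is already exact
        ka, kb = leading_one(a), leading_one(b)
        ra, rb = a - (1 << ka), b - (1 << kb)  # residues below the leading one
        # Keep 2^(ka+kb) + ra*2^kb + rb*2^ka, drop ra*rb for the next level.
        result += (1 << (ka + kb)) + (ra << kb) + (rb << ka)
        a, b = ra, rb
    return result


if __name__ == "__main__":
    for lv in (1, 2, 3):
        print(lv, approx_log_mul(13, 11, lv))  # exact product is 143
```

For 13 × 11 = 143, this sketch yields 128 at level 1, 142 at level 2, and 143 at level 3, which illustrates how a run-time level setting could trade extra shift-and-add work for accuracy. The error figures reported in the abstract come from the authors' hardware design and are not reproduced by this toy model.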