MACcelerator: Approximate Arithmetic Unit for Computational Acceleration

Alice Sokolova1, Mohsen Imani2, Andrew Huang1, Ricardo Garcia1, Justin Morris3, Tajana Rosing4, Baris Aksanli5
1University of California San Diego, 2University of California Irvine, 3University of California San Diego, 4University of California San Diego, 5San Diego State University


As computationally expensive applications such as neural networks gain popularity, approximate computing has emerged as a way to significantly reduce the energy and latency costs of heavy computational workloads. In this paper, we propose a highly accurate approximate floating-point Multiply-and-Accumulate (MAC) unit for GPUs that significantly decreases the power and delay costs of a MAC operation. The design combines an intelligent input analysis scheme, which approximates the addition stage of a MAC operation, with an efficient approximate multiplier, which simplifies the multiplication stage. Its accuracy is tunable, so accuracy can be traded for increased efficiency. We evaluate the proposed design over a range of multimedia and machine learning applications. It offers up to 2.18x and 3.21x energy-delay product (EDP) improvement for machine learning and multimedia applications, respectively, while providing quality comparable to an exact GPU.