Image Quantization Tradeoffs in a YOLO-based FPGA Accelerator Framework

Richard Yarnell, Mousam Hossain, Ronald DeMara
University of Central Florida


Abstract

Until recently, FPGA-based acceleration of convolutional neural networks (CNNs) has remained an open research problem. Herein, we evaluate a new method for rapidly implementing CNNs using industry-standard frameworks on Xilinx System-on-a-Chip (SoC) devices. Within this workflow, referred to as the Framework for Accelerating YOLO-Based ML on Edge-devices (FAYME), a TensorFlow model of the You Only Look Once version 4 (YOLOv4) object detection algorithm is realized using Xilinx's Vitis AI toolchain and evaluated on a Xilinx UltraScale+ SoC FPGA development platform. We test multiple levels of model bit-quantization, evaluating detection performance with mean average precision (mAP) while simultaneously analyzing the utilization of available memory and processing elements. We also implement a ResNet-50 model to provide additional performance comparisons.