High-Performance NoCs employing the DSP48E1 blocks of the Xilinx FPGAs

Prabhu Prasad B M¹, Khyamling Parane², Basavaraj Talawar³
¹National Institute of Technology Karnataka, Surathkal; ²National Institute of Technology Karnataka; ³CSE, NITK, Surathkal


The hard multiplexers of the Xilinx DSP48E1 slices are used to implement the crossbar switch of buffered five-port Network-on-Chip (NoC) routers. This is possible because the operating mode of a DSP48E1 slice can be changed dynamically on every clock cycle through its multiplexer control signals. As a result, a significant reduction in the soft logic (LUT and FF) utilization of the FPGA implementation of a 6 × 6 Mesh topology is observed: the DSP-based crossbar implementation consumes 36% fewer LUTs and 40% fewer FFs than the LUT-based crossbar implementation. The DSP-based implementation also consumes 38% less power. The proposed work uses 41% fewer LUTs than the state-of-the-art CONNECT NoC generation tool. For an 8 × 8 Mesh topology, the proposed DSP48E1-based crossbar implementation reduces latency by 31% and 38% relative to the LUT-based crossbar implementation under the Uniform and Transpose traffic patterns, respectively. It also improves saturation throughput by 1.4× and 1.6× over the LUT-based implementation under the Uniform and Transpose traffic patterns, respectively.
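To illustrate the crossbar behaviour described above, the following is a minimal behavioural sketch in Python of a five-port crossbar whose per-output multiplexer select signals may change on every clock cycle. It is not the authors' DSP48E1 mapping and does not model the DSP48E1 slice itself. The port names, the function `crossbar_cycle`, and the rule that each input drives at most one output per cycle are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence

# Assumed port names for a five-port mesh router (illustrative only).
PORTS = ("north", "east", "south", "west", "local")


def crossbar_cycle(
    inputs: Sequence[Optional[int]],
    selects: Sequence[Optional[int]],
) -> List[Optional[int]]:
    """Model one clock cycle of a 5x5 crossbar.

    inputs[i]  : flit present at input port i (None if idle).
    selects[o] : index of the input port forwarded to output port o,
                 or None if output o is idle in this cycle.
    Returns the flit seen at each output port.
    """
    n = len(PORTS)
    if len(inputs) != n or len(selects) != n:
        raise ValueError("expected one entry per port")

    granted = [s for s in selects if s is not None]
    if any(not 0 <= s < n for s in granted):
        raise ValueError("select value out of range")
    # Assumption: unicast routing, so an input drives at most one output.
    if len(granted) != len(set(granted)):
        raise ValueError("an input port may drive at most one output per cycle")

    # Each output port behaves as a 5:1 multiplexer over the input ports.
    return [None if s is None else inputs[s] for s in selects]


# The select signals can differ from one cycle to the next.
flits = [0xA0, 0xB1, None, 0xD3, 0xE4]
cycle_1 = crossbar_cycle(flits, [4, None, 0, None, 1])
cycle_2 = crossbar_cycle(flits, [1, 0, None, 4, 3])
```

In this sketch the `selects` list plays the role of the multiplexer control signals: changing it between calls reconfigures which input reaches which output without altering the datapath.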