Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA
Convolutional neural network (CNN) dataflow inference accelerators implemented on Field Programmable Gate Arrays (FPGAs) have demonstrated higher energy efficiency and lower latency than CNN execution on CPUs or GPUs. However, the complex shapes of CNN parameter memories typically map poorly to FPGA on-chip memories (OCM), which results …