Memory-Efficient Dataflow Inference for Deep CNNs on FPGA
Custom dataflow Convolutional Neural Network (CNN) inference accelerators on FPGA are tailored to a specific CNN topology and store parameters in On-Chip Memory (OCM), creating the potential for high energy efficiency and low inference latency. However, in these accelerators the shapes of parameter memories are dictated by throughput constraints and …