Supporting Massive DLRM Inference through Software Defined Memory
Deep Learning Recommendation Models (DLRM) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model sizes soon to reach the terabyte range, leveraging Storage Class Memory (SCM) for inference enables lower power consumption. This paper evaluates the major challenges in extending …