QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices

A number of production deep learning clusters have explored using inference hardware for DNN training during off-peak serving hours, when many inference GPUs sit idle. Conducting DNN training with a combination of heterogeneous training and inference GPUs, known as hybrid device training, presents considerable challenges due to disparities in …