Benchmarking network fabrics for data distributed training of deep neural networks

This work introduces TapirXLA, a replacement for TensorFlow's XLA compiler that embeds recursive fork-join parallelism into XLA's low-level representation of code. Machine-learning applications rely on efficient parallel processing to achieve performance, and they employ a variety of technologies to improve performance, including compiler technology. But compilers in machine-learning frameworks lack …