TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
This work introduces TapirXLA, a replacement for TensorFlow's XLA compiler that embeds recursive fork-join parallelism into XLA's low-level representation of code. Machine-learning applications rely on efficient parallel processing to achieve performance, and they employ a variety of technologies to improve performance, including compiler technology. But compilers in machine-learning frameworks lack …