TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
This work introduces TapirXLA, a replacement for TensorFlow's XLA compiler that embeds recursive fork-join parallelism into XLA's low-level representation of code. Machine-learning applications employ a variety of technologies to improve performance, including compiler technology. But compilers in machine-learning frameworks lack a deep understanding of parallelism, causing them to lose performance …