Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search
The real-world effectiveness of deep neural networks often depends on their latency, which calls for optimization techniques that reduce a model's inference time while preserving its performance. One popular approach is to sequentially rewrite the input computation graph into an equivalent but faster one by replacing individual subgraphs. This approach …