BitPipe: Bidirectional Interleaved Pipeline Parallelism for Accelerating Large Models Training

As models grow in scale, the need for efficient distributed training has become increasingly urgent. Recently, many synchronous pipeline parallelism approaches have been proposed to improve training throughput. However, these approaches still suffer from two major issues: pipeline bubbles caused by periodic flushing and extra communication due to …