Towards Distributed Machine Learning in Shared Clusters: A Dynamically-Partitioned Approach
Many cluster management systems (CMSs) have been proposed to share a single cluster among multiple distributed computing systems. However, none of the existing approaches can handle distributed machine learning (ML) workloads while meeting the following criteria: high resource utilization, fair resource allocation, and low sharing overhead. To solve this problem, we …