Wednesday, July 10 • 3:00pm - 3:20pm
STRADS-AP: Simplifying Distributed Machine Learning Programming without Introducing a New Programming Model


Converting sequential code for a Machine Learning (ML) model, as published by an ML researcher, into a distributed framework that runs on a cluster and operates on massive datasets is a daunting task for a data scientist. Fitting the sequential code to the programming model and data abstractions dictated by the framework of choice requires significant engineering and cognitive effort. Furthermore, the inherent constraints of these frameworks sometimes force inefficient implementations that deliver suboptimal performance.

We show that it is possible to achieve automatic and efficient distributed parallelization of familiar sequential ML code by making a few mechanical changes to it, while hiding the details of concurrency control, data partitioning, task parallelization, and fault tolerance. To this end, we design and implement a new distributed ML framework, STRADS-Automatic Parallelization (AP), and demonstrate that it significantly simplifies distributed ML programming, outperforms a popular data-parallel framework with an unfamiliar programming model, and achieves performance comparable to an ML-specialized framework.
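To illustrate the "few mechanical changes" idea, here is a minimal toy sketch. The names `dvector` and `parallel_for` are hypothetical stand-ins for framework-provided distributed containers and parallel loop operators, not the actual STRADS-AP API (which is C++); they are simulated locally so the example runs on a single machine.

```python
# Hypothetical sketch: `dvector` and `parallel_for` are illustrative
# stand-ins, NOT the real STRADS-AP API. They are simulated locally here.
from concurrent.futures import ThreadPoolExecutor


def dvector(data):
    # Stand-in for a distributed vector; locally it is just a list.
    # A real framework would partition `data` across cluster nodes.
    return list(data)


def parallel_for(func, items):
    # Stand-in for a framework-managed parallel loop. A real framework
    # would also handle concurrency control and fault tolerance.
    with ThreadPoolExecutor() as ex:
        return list(ex.map(func, items))


# Original sequential loop: square each element.
data = [1, 2, 3, 4]
seq_result = [x * x for x in data]

# "Mechanically changed" version: the loop body is unchanged; only the
# container type and the loop construct are swapped.
dist_data = dvector(data)
par_result = parallel_for(lambda x: x * x, dist_data)

assert seq_result == par_result  # same semantics as the sequential code
print(par_result)
```

The point of the sketch is that the loop body stays as the data scientist wrote it; only the container and the loop construct change, while the framework owns partitioning and scheduling behind those two calls.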


Jin Kyu Kim

Carnegie Mellon University

Abutalib Aghayev

Carnegie Mellon University

Garth A. Gibson

Carnegie Mellon University, Vector Institute, University of Toronto

Eric P. Xing

Petuum Inc, Carnegie Mellon University

Wednesday July 10, 2019 3:00pm - 3:20pm PDT
USENIX ATC Track II: Grand Ballroom VII–IX