Wednesday, July 10 • 2:40pm - 3:00pm
Tangram: Bridging Immutable and Mutable Abstractions for Distributed Data Analytics

Data analytics frameworks that adopt an immutable data abstraction usually provide better support for failure recovery and straggler mitigation, while those that adopt a mutable data abstraction are more efficient for iterative workloads thanks to their support for in-place state updates and asynchronous execution. Most existing frameworks adopt only one of the two data abstractions and so do not enjoy the benefits of the other. In this paper, we propose a novel programming model named MapUpdate, which can determine whether a distributed dataset is mutable or immutable in an application. We show that MapUpdate not only offers good expressiveness, but also allows us to enjoy the benefits of both mutable and immutable abstractions. MapUpdate naturally supports iterative and asynchronous execution, and can use different recovery strategies adaptively according to failure scenarios. We implemented MapUpdate in a system called Tangram, with novel system designs such as lightweight local task management, partition-based progress control, and context-aware failure recovery. Extensive experiments verified the benefits of Tangram on a variety of workloads, including bulk processing, graph analytics, and iterative machine learning.

Speakers

Yuzhen Huang

The Chinese University of Hong Kong

Xiao Yan

The Chinese University of Hong Kong

Guanxian Jiang

The Chinese University of Hong Kong

Tatiana Jin

The Chinese University of Hong Kong

James Cheng

The Chinese University of Hong Kong

An Xu

The Chinese University of Hong Kong

Zhanhao Liu

The Chinese University of Hong Kong

Shuo Tu

The Chinese University of Hong Kong
Wednesday July 10, 2019 2:40pm - 3:00pm PDT
USENIX ATC Track II: Grand Ballroom VII–IX