HotCloud '19
Monday, July 8
 

9:00am PDT

Opening Remarks
Speakers
Christina Delimitrou (Cornell University)
Dan R. K. Ports (Microsoft Research)


Monday July 8, 2019 9:00am - 9:15am PDT
HotCloud: Grand Ballroom VII–IX

9:15am PDT

The Seven Sins of Personal-Data Processing Systems under GDPR
In recent years, our society has been plagued by unprecedented levels of privacy and security breaches. To rein in this trend, the European Union introduced comprehensive legislation, the General Data Protection Regulation (GDPR), in 2018. In this paper, we review GDPR from a system design perspective and identify how its regulations conflict with the design, architecture, and operation of modern systems. We illustrate these conflicts via the seven GDPR sins: storing data forever; reusing data indiscriminately; walled gardens and black markets; risk-agnostic data processing; hiding data breaches; making unexplainable decisions; and treating security as a secondary goal. Our findings reveal a deep-rooted tussle between GDPR requirements and how modern systems have evolved. We believe that achieving compliance requires comprehensive, ground-up solutions; anything short would amount to fixing a leaky faucet in a sinking ship.

Speakers
Supreeth Shastri (University of Texas at Austin)
Melissa Wasserman (University of Texas at Austin)
Vijay Chidambaram (University of Texas at Austin)


Monday July 8, 2019 9:15am - 9:30am PDT
HotCloud: Grand Ballroom VII–IX

9:30am PDT

NetWarden: Mitigating Network Covert Channels without Performance Loss
Network covert channels are an advanced threat to the security and privacy of cloud systems. One common limitation of existing defenses is that they all come at the cost of performance, which presents a significant barrier to practical deployment in high-speed networks. We sketch the design of NetWarden, a novel defense whose key goal is to preserve TCP performance while mitigating covert channels. The use of programmable data planes makes it possible for NetWarden to adapt defenses that were previously demonstrated only as proofs of concept and apply them at line speed. Moreover, NetWarden uses a set of performance-boosting techniques to temporarily increase the performance of connections that have been affected by channel mitigation, with the ultimate goal of neutralizing its impact on performance. Our simulation provides initial evidence that NetWarden can mitigate several covert channels with little performance disturbance. We are currently working on a full system design and implementation of NetWarden.
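The abstract describes re-timing suspect flows in the data plane and then compensating the affected connections. The toy Python sketch below (all names and parameters are hypothetical; this is not NetWarden's implementation) illustrates the shape of that idea: packets of a suspect flow are released on a fixed pacing schedule so inter-packet gaps no longer carry information, and the injected delay is tracked as a budget for a later performance boost.

class TimingChannelNeutralizer:
    """Toy model (hypothetical): re-time packets of suspect flows onto a fixed
    pacing schedule so inter-packet gaps no longer carry information, and track
    how much delay was added so a 'performance boost' phase can compensate."""

    def __init__(self, pace_us=100):
        self.pace_us = pace_us       # fixed inter-packet gap for suspect flows
        self.next_release = {}       # flow id -> earliest release time (us)
        self.added_delay = {}        # flow id -> total delay injected (us)

    def on_packet(self, flow, arrival_us):
        # Packets leave on a fixed schedule, erasing sender-controlled timing.
        release = max(arrival_us, self.next_release.get(flow, 0))
        self.next_release[flow] = release + self.pace_us
        self.added_delay[flow] = self.added_delay.get(flow, 0) + (release - arrival_us)
        return release               # time at which the packet is forwarded

    def boost_budget(self, flow):
        # Latency the compensation phase should try to win back, e.g. by
        # temporarily enlarging the flow's advertised window.
        return self.added_delay.get(flow, 0)

# Example: a sender modulating small gaps to encode bits loses its channel.
n = TimingChannelNeutralizer(pace_us=100)
arrivals = [0, 30, 60, 200, 230]                   # irregular, information-carrying gaps
departures = [n.on_packet("f1", t) for t in arrivals]
print(departures, n.boost_budget("f1"))            # evenly paced departures + delay budget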

Speakers
Jiarong Xing (Rice University)
Adam Morrison (Rice University)
Ang Chen (Rice University)


Monday July 8, 2019 9:30am - 9:45am PDT
HotCloud: Grand Ballroom VII–IX

9:45am PDT

A Double-Edged Sword: Security Threats and Opportunities in One-Sided Network Communication
One-sided network communication technologies such as RDMA and NVMe-over-Fabrics are quickly gaining adoption in production software and in datacenters. Although appealing for their low CPU utilization and good performance, they raise new security concerns that could seriously undermine the datacenter software systems built on top of them. At the same time, they offer unique opportunities to help enhance security. Indeed, one-sided network communication is a double-edged sword for security. This paper presents our insights into the security implications and opportunities of one-sided communication.

Speakers
Shin-Yeh Tsai (Purdue University)
Yiying Zhang (Purdue University)


Monday July 8, 2019 9:45am - 10:00am PDT
HotCloud: Grand Ballroom VII–IX

10:00am PDT

DynaShield: Reducing the Cost of DDoS Defense using Cloud Services
Fueled by IoT botnets and DDoS-as-a-Service tools, distributed denial-of-service (DDoS) attacks have reached record-high volumes. Although DDoS protection services exist, they can be costly for small organizations as well as individual users. In this paper, we present a low-cost DDoS solution, DynaShield, which a user can deploy at common cloud service providers. DynaShield employs three techniques to reduce cost. First, it uses an on-demand model: a server dynamically updates its DNS record to redirect clients' traffic to DynaShield when it is under attack, avoiding paying for cloud services during peacetime. Second, DynaShield combines serverless functions and elastic servers provided by cloud providers to auto-scale to large attacks without overprovisioning. Third, DynaShield uses cryptocurrency puzzles as proof of work. The coin-mining profit can further offset a protected server's cloud service charges. Our preliminary evaluation suggests that DynaShield can cost as little as a few dollars per month to protect an organization from common DDoS attacks.
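As a rough illustration of the on-demand model (the first technique), the sketch below, with an entirely hypothetical detector and DNS helper rather than DynaShield's actual code, flips a domain's address record between the origin server and the scrubbing front end only while an attack is detected:

# Toy sketch of an on-demand DDoS shield (hypothetical helpers, not the
# authors' implementation): point the domain at the scrubbing front end only
# while an attack is detected, so no cloud charges accrue during peacetime.

ORIGIN_IP = "198.51.100.10"       # protected server (example address)
SHIELD_IP = "203.0.113.20"        # shield front end (example address)

def update_dns_a_record(domain: str, ip: str) -> None:
    # Placeholder: a real deployment would call the DNS provider's API here
    # with a short TTL; printing stands in for that call.
    print(f"[dns] {domain} -> {ip}")

def under_attack(pps: float, baseline_pps: float, factor: float = 10.0) -> bool:
    # Hypothetical detector: treat a large jump in packet rate as an attack.
    return pps > factor * baseline_pps

def control_loop(domain: str, samples, baseline_pps: float):
    shielded = False
    for pps in samples:
        if under_attack(pps, baseline_pps) and not shielded:
            update_dns_a_record(domain, SHIELD_IP)   # redirect traffic for scrubbing
            shielded = True
        elif not under_attack(pps, baseline_pps) and shielded:
            update_dns_a_record(domain, ORIGIN_IP)   # attack over: stop paying
            shielded = False

control_loop("example.org", [100, 120, 5000, 8000, 150, 110], baseline_pps=100)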

Speakers
Shengbao Zheng (Duke University)
Xiaowei Yang (Duke University)


Monday July 8, 2019 10:00am - 10:15am PDT
HotCloud: Grand Ballroom VII–IX

10:15am PDT

Deep Enforcement: Policy-based Data Transformations for Data in the Cloud
Despite the growing collection and use of private data in the cloud, there remains a fundamental disconnect between unified data governance and storage-level enforcement techniques. On one side, high-level governance policies derived from regulations such as the General Data Protection Regulation (GDPR) have emerged, with stricter rules dictating who can process data, and when and how. On the other side, storage-level controls, whether role- or attribute-based, continue to focus on allow/deny enforcement. In this paper, we propose how to bridge this gap. We introduce Deep Enforcement, a system that provides unified governance and transformation policies, coupled with data transformations embedded into the storage fabric, to achieve policy compliance. The transformations vary in complexity, from simple redactions to differential-privacy-based techniques that provide the required amount of anonymization. We show how this architecture can be implemented in two broad classes of cloud data stores: object stores and SQL databases. Depending on the complexity of the transformation, we also demonstrate how to apply it either in-line (on data access) or off-line (by creating an alternate cached dataset).
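A minimal sketch of the in-line enforcement idea, assuming a hypothetical policy table and two stand-in transformations (the paper's transformations range up to differential-privacy-based anonymization), might look like this; it is an illustration, not the authors' system:

import hashlib

# Hypothetical policy table mapping (requester role, field) to a transformation.
POLICY = {
    ("analyst", "email"): "pseudonymize",
    ("analyst", "ssn"): "redact",
    ("auditor", "ssn"): "pass",
}

def pseudonymize(value: str) -> str:
    return hashlib.sha256(value.encode()).hexdigest()[:12]

def transform(field: str, value: str, role: str) -> str:
    action = POLICY.get((role, field), "redact")   # default-deny: redact unknown cases
    if action == "pass":
        return value
    if action == "pseudonymize":
        return pseudonymize(value)
    return "***"                                   # redact

def read_record(record: dict, role: str) -> dict:
    # In-line enforcement: the storage layer applies the policy on every access.
    return {f: transform(f, v, role) for f, v in record.items()}

print(read_record({"email": "a@example.com", "ssn": "123-45-6789"}, role="analyst"))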

Speakers
Ety Khaitzin (IBM Research)
Maya Anderson (IBM Research)
Hani Jamjoom (IBM Research)
Ronen Kat (IBM Research)
Arjun Natarajan (IBM Research)
Roger Raphael (IBM Cloud)
Roee Shlomo (IBM Research)
Tomer Solomon (IBM Research)


Monday July 8, 2019 10:15am - 10:30am PDT
HotCloud: Grand Ballroom VII–IX

10:30am PDT

Break with Refreshments
Monday July 8, 2019 10:30am - 11:00am PDT
Grand Ballroom Prefunction

11:00am PDT

Jumpgate: In-Network Processing as a Service for Data Analytics
In-network processing, where data is processed by special-purpose devices as it passes over the network, is showing great promise at improving application performance, in particular for data analytics tasks. However, analytics and in-network processing are not yet integrated and widely deployed. This paper presents a vision for providing in-network processing as a service to data analytics frameworks, and outlines benefits, remaining challenges, and our current research directions towards realizing this vision.

Speakers
Craig Mustard (University of British Columbia)
Fabian Ruffy (University of British Columbia)
Anny Gakhokidze (University of British Columbia)
Ivan Beschastnikh (University of British Columbia)
Alexandra Fedorova (University of British Columbia)


Monday July 8, 2019 11:00am - 11:15am PDT
HotCloud: Grand Ballroom VII–IX

11:15am PDT

Just In Time Delivery: Leveraging Operating Systems Knowledge for Better Datacenter Congestion Control
Network links and server CPUs are heavily contended resources in modern datacenters. To keep tail latencies low, datacenter operators drastically overprovision both types of resources today, and there has been significant research into effectively managing network traffic and CPU load. However, this work typically looks at the two resources in isolation.

In this paper, we make the observation that, in the datacenter, the allocation of network and CPU resources should be co-designed to achieve the best efficiency and response times. For example, while congestion control protocols can prioritize traffic from certain flows, this provides no benefit if the traffic arrives at an overloaded server that will only queue the request.

This paper explores the potential benefits of such a co-designed resource allocator and considers the recent work in both CPU scheduling and congestion control that is best suited to such a system. We propose Chimera, a new datacenter OS that integrates a receiver-based congestion control protocol with OS insight into application queues, using the recent Shenango operating system.
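A minimal sketch of the co-design idea, under the assumption of a receiver-driven credit scheme and a hypothetical queue-depth signal from the OS (not the authors' protocol), is shown below: the receiver shrinks the credit it grants to senders as the destination application's queue grows.

# Toy co-design sketch (hypothetical): a receiver-driven scheme grants send
# credits per flow, but scales the grant down when the OS reports a deep
# application queue, so prioritized traffic is not sent to a server that
# would only queue it.

MAX_CREDIT_BYTES = 64 * 1024       # per-RTT grant when the app is keeping up
QUEUE_TARGET = 32                  # requests; beyond this the app is overloaded

def grant_credit(app_queue_len: int, demand_bytes: int) -> int:
    # Shrink the grant linearly as the application queue grows past the target.
    if app_queue_len <= QUEUE_TARGET:
        scale = 1.0
    else:
        scale = max(0.0, 1.0 - (app_queue_len - QUEUE_TARGET) / QUEUE_TARGET)
    return min(demand_bytes, int(MAX_CREDIT_BYTES * scale))

# The receiver consults OS/application queue state (as Shenango-style runtimes
# expose) each RTT and tells each sender how much it may transmit.
for qlen in (4, 32, 48, 64):
    print(qlen, grant_credit(qlen, demand_bytes=128 * 1024))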

Speakers
Adam Belay (MIT CSAIL)
Irene Zhang (Microsoft Research)


Monday July 8, 2019 11:15am - 11:30am PDT
HotCloud: Grand Ballroom VII–IX

11:30am PDT

Designing an Efficient Replicated Log Store with Consensus Protocol
A highly available, high-performance message logging system is a critical building block for use cases that require global ordering, especially deterministic distributed transactions. To achieve availability, we maintain multiple replicas that hold the same payloads in exactly the same order. This introduces challenging issues, such as restoring consistency between replicas after a failure while minimizing performance degradation. Replicated-state-machine-based consensus protocols are the most suitable candidates to fulfill these requirements, but the double-write problem and differences in logging granularity make it hard to keep the system efficient. This paper suggests a novel way to build a replicated log store on top of the Raft consensus protocol, aiming to provide the same level of consistency and fault tolerance without sacrificing the throughput of the system.
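The double-write problem the abstract mentions arises when a consensus library persists its own log and the log store then writes the same payload again. One hedged sketch of how the two logs might be unified, using entirely hypothetical interfaces rather than the authors' design, is:

# Hypothetical sketch: expose the message log store as the log-storage backend
# of a Raft implementation, so an appended record is persisted once instead of
# twice (Raft log + application log).

class LogStore:
    """Append-only store that doubles as the consensus log."""
    def __init__(self):
        self.entries = []                  # (term, payload); on disk in a real system

    def append(self, term: int, payload: bytes) -> int:
        self.entries.append((term, payload))
        return len(self.entries) - 1       # log index

    def truncate_from(self, index: int):
        del self.entries[index:]           # conflict resolution after a leader change

    def read(self, index: int) -> bytes:
        return self.entries[index][1]

class RaftAdapter:
    """What a Raft library would call; it persists directly into LogStore."""
    def __init__(self, store: LogStore):
        self.store = store
        self.commit_index = -1

    def append_entry(self, term: int, payload: bytes) -> int:
        return self.store.append(term, payload)

    def mark_committed(self, index: int):
        # Readers of the message log only see entries up to commit_index,
        # giving Raft-level consistency without a second write.
        self.commit_index = index

store = LogStore()
raft = RaftAdapter(store)
idx = raft.append_entry(term=1, payload=b"txn-1001")
raft.mark_committed(idx)
print(store.read(idx))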

Speakers
Jung-Sang Ahn (eBay Inc.)
Woon-Hak Kang (eBay Inc.)
Kun Ren (eBay Inc.)
Guogen Zhang (eBay Inc.)


Monday July 8, 2019 11:30am - 11:45am PDT
HotCloud: Grand Ballroom VII–IX

11:45am PDT

Tackling Parallelization Challenges of Kernel Network Stack for Container Overlay Networks
Overlay networks are the de facto networking technique for providing flexible, customized connectivity among distributed containers in the cloud. However, overlay networks also incur non-trivial overhead due to their complexity, resulting in significant network performance degradation for containers. In this paper, we perform a comprehensive empirical performance study of container overlay networks that identifies important, previously unreported parallelization bottlenecks in the kernel network stack that prevent container overlay networks from scaling. Our observations and root-cause analysis shed light on how to optimize the network stack of modern operating systems on multi-core systems to more efficiently support container overlay networks in the era of high-speed network devices.

Speakers
Jiaxin Lei (SUNY at Binghamton)
Kun Suo (The University of Texas at Arlington)
Hui Lu (SUNY at Binghamton)
Jia Rao (The University of Texas at Arlington)


Monday July 8, 2019 11:45am - 12:00pm PDT
HotCloud: Grand Ballroom VII–IX

12:00pm PDT

A Black-Box Approach for Estimating Utilization of Polled IO Network Functions
Cloud management tasks such as performance diagnosis, workload placement, and power management depend critically on estimating the utilization of an application. However, it is challenging to measure actual utilization for polled-IO network functions (NFs) without code instrumentation. We ask whether CPU events (e.g., data cache misses) measured using hardware performance counters are good estimators of utilization for polled-IO NFs. We find a strong correlation between several CPU events and NF utilization for three representative types of network functions. Inspired by this finding, we explore the possibility of computing a universal estimation function that maps selected CPU events to NF utilization estimates for a wide range of NFs, traffic profiles, and traffic loads. Our NF-specific estimators and universal estimators achieve absolute estimation errors below 6% and 10%, respectively.
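A minimal sketch of the estimation step, assuming made-up counter values and a simple least-squares fit rather than the authors' models, is shown below:

import numpy as np

# Hypothetical sketch: fit a linear map from per-second hardware-counter deltas
# to measured NF utilization, then use it to estimate utilization for a
# polled-IO NF without instrumentation. Training data here is invented.

# Columns: data-cache misses, LLC misses, retired branches.
X = np.array([
    [1.2e6, 3.0e5, 4.0e6],
    [2.5e6, 6.1e5, 8.3e6],
    [3.9e6, 9.0e5, 1.25e7],
    [5.1e6, 1.2e6, 1.66e7],
], dtype=float)
y = np.array([0.18, 0.39, 0.61, 0.80])     # ground-truth utilization (fraction)

# Add an intercept term and solve ordinary least squares.
A = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def estimate_utilization(counters):
    feats = np.append(np.asarray(counters, dtype=float), 1.0)
    return float(np.clip(feats @ coef, 0.0, 1.0))

print(estimate_utilization([3.0e6, 7.0e5, 1.0e7]))   # estimate for a new counter sample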

Speakers
Harshit Gupta (Georgia Institute of Technology)
Abhigyan Sharma (AT&T Labs Research)
Alex Zelezniak (AT&T Labs Research)
Minsung Jang (AT&T Labs Research)
Umakishore Ramachandran (Georgia Institute of Technology)


Monday July 8, 2019 12:00pm - 12:15pm PDT
HotCloud: Grand Ballroom VII–IX

12:15pm PDT

Luncheon for Workshop Attendees
Monday July 8, 2019 12:15pm - 2:00pm PDT
Olympic Pavilion

2:00pm PDT

DLion: Decentralized Distributed Deep Learning in Micro-Clouds
Deep learning is a popular technique for building inference models and classifiers from large quantities of input data for applications in many domains. With the proliferation of edge devices such as sensors and mobile devices, large volumes of data are generated at a rapid pace all over the world. Migrating large amounts of data into centralized data centers over WAN environments is often infeasible for cost, performance, or privacy reasons. Moreover, there is an increasing need for incremental or online deep learning over newly generated data in real time. These trends require rethinking the traditional approach to training deep learning models. To handle computation on distributed input data, micro-clouds, small-scale clouds deployed near edge devices in many different locations, provide an attractive alternative for data-locality reasons. However, existing distributed deep learning systems do not support training in micro-clouds, owing to the unique characteristics and challenges of this environment. In this paper, we examine the key challenges of deep learning in micro-clouds: heterogeneity of compute and network resources both across and within micro-clouds, and their scale. We present DLion, a decentralized distributed deep learning system for such environments. It employs techniques specifically designed to address these challenges in order to reduce training time, enhance model accuracy, and provide system scalability. We have implemented a prototype of DLion in TensorFlow, and our preliminary experiments show promising results towards achieving accurate and efficient distributed deep learning in micro-clouds.

Speakers
Rankyung Hong (University of Minnesota)
Abhishek Chandra (University of Minnesota)


Monday July 8, 2019 2:00pm - 2:15pm PDT
HotCloud: Grand Ballroom VII–IX

2:15pm PDT

The Case for Unifying Data Loading in Machine Learning Clusters
Training machine learning models involves iteratively fetching and pre-processing batches of data. Conventionally, popular ML frameworks implement data loading within a job and focus on improving the performance of a single job. However, such an approach is inefficient in shared clusters, where multiple training jobs are likely to be accessing the same data and duplicating operations. To illustrate this, we present a case study which reveals that for hyper-parameter tuning experiments we can remove up to 89% of the I/O redundancy and 97% of the pre-processing redundancy.

Based on this observation, we make the case for unifying data loading in machine learning clusters by bringing the isolated data loading systems together into a single system. Such a system architecture can remove the aforementioned redundancies, which arise from isolating data loading within each job. We introduce OneAccess, a unified data access layer, and present a prototype implementation that shows a 47.3% improvement in I/O cost when sharing data across jobs. Finally, we discuss open research challenges in designing and developing a unified data loading layer that can run across frameworks on shared multi-tenant clusters, including how to handle distributed data access, support diverse sampling schemes, and exploit new storage media.
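A minimal sketch of the sharing idea, with hypothetical helper names and a local directory standing in for shared cluster storage (not the OneAccess prototype), is:

import hashlib, os, pickle

# Hypothetical sketch: jobs that request the same (dataset, pre-processing) pair
# share one cached, pre-processed copy instead of each reading and transforming
# the raw data.

CACHE_DIR = "/tmp/shared_batches"          # would be shared storage in a cluster
os.makedirs(CACHE_DIR, exist_ok=True)

def cache_key(dataset_path: str, preprocess_name: str) -> str:
    return hashlib.sha1(f"{dataset_path}|{preprocess_name}".encode()).hexdigest()

def load_batches(dataset_path, preprocess_name, preprocess_fn, read_fn):
    """First caller pays the I/O and pre-processing cost; later jobs hit the cache."""
    path = os.path.join(CACHE_DIR, cache_key(dataset_path, preprocess_name) + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:        # shared across hyper-parameter trials
            return pickle.load(f)
    batches = [preprocess_fn(x) for x in read_fn(dataset_path)]
    with open(path, "wb") as f:
        pickle.dump(batches, f)
    return batches

# Example: two "jobs" with different hyper-parameters reuse the same cached data.
fake_read = lambda p: [[1, 2, 3], [4, 5, 6]]
normalize = lambda xs: [x / 6 for x in xs]
job_a = load_batches("s3://bucket/train", "normalize", normalize, fake_read)
job_b = load_batches("s3://bucket/train", "normalize", normalize, fake_read)  # cache hit
print(job_a == job_b)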

Speakers
Aarati Kakaraparthy (University of Wisconsin, Madison & Microsoft Gray Systems Lab, Madison)
Abhay Venkatesh (University of Wisconsin, Madison)
Amar Phanishayee (Microsoft Research)
Shivaram Venkataraman (University of Wisconsin and Microsoft Research)


Monday July 8, 2019 2:15pm - 2:30pm PDT
HotCloud: Grand Ballroom VII–IX

2:30pm PDT

Accelerating Deep Learning Inference via Freezing
Over the last few years, Deep Neural Networks (DNNs) have become ubiquitous owing to their high accuracy on real-world tasks. However, this increase in accuracy comes at the cost of computationally expensive models, leading to higher prediction latencies. Prior efforts to reduce this latency, such as quantization, model distillation, and any-time prediction models, typically trade off accuracy for performance. In this work, we observe that caching intermediate layer outputs can help us avoid running all the layers of a DNN for a sizeable fraction of inference requests. We find that this can potentially reduce the number of effective layers by half for 91.58% of CIFAR-10 requests run on ResNet-18. We present Freeze Inference, a system that introduces approximate caching at each intermediate layer, and we discuss techniques to reduce the cache size and improve the cache hit rate. Finally, we discuss some of the open research challenges in realizing such a design.
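A minimal sketch of approximate caching at an intermediate layer, using toy numpy "layers" and a hypothetical distance threshold rather than the Freeze Inference implementation, is shown below:

import numpy as np

class ApproxLayerCache:
    """Toy approximate cache keyed on an intermediate activation (hypothetical)."""
    def __init__(self, threshold=0.5):
        self.keys, self.values = [], []
        self.threshold = threshold

    def lookup(self, activation):
        if not self.keys:
            return None
        dists = [np.linalg.norm(activation - k) for k in self.keys]
        i = int(np.argmin(dists))
        return self.values[i] if dists[i] < self.threshold else None

    def insert(self, activation, prediction):
        self.keys.append(activation)
        self.values.append(prediction)

def infer(x, prefix_layers, suffix_layers, cache):
    h = x
    for layer in prefix_layers:          # always run the cheap prefix of the network
        h = layer(h)
    hit = cache.lookup(h)
    if hit is not None:                  # close enough to a past request: skip the rest
        return hit
    key = h.copy()
    for layer in suffix_layers:          # otherwise run the expensive remaining layers
        h = layer(h)
    prediction = int(np.argmax(h))
    cache.insert(key, prediction)
    return prediction

# Toy "layers": matrix multiplies standing in for ResNet blocks.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 8)), rng.normal(size=(8, 10))
cache = ApproxLayerCache(threshold=0.5)
x = rng.normal(size=8)
prefix, suffix = [lambda v: np.tanh(v @ W1)], [lambda v: v @ W2]
print(infer(x, prefix, suffix, cache))          # computed the long way, then cached
print(infer(x + 1e-3, prefix, suffix, cache))   # near-duplicate request: cache hit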

Speakers
Adarsh Kumar (University of Wisconsin, Madison)
Arjun Balasubramanian (University of Wisconsin, Madison)
Shivaram Venkataraman (University of Wisconsin and Microsoft Research)
Aditya Akella (University of Wisconsin - Madison)


Monday July 8, 2019 2:30pm - 2:45pm PDT
HotCloud: Grand Ballroom VII–IX

2:45pm PDT

Characterization and Prediction of Performance Interference on Mediated Passthrough GPUs for Interference-aware Scheduler
Sharing GPUs in the cloud is cost-effective and can facilitate the adoption of hardware-accelerator-enabled clouds. But sharing causes interference between co-located VMs and leads to performance degradation. In this paper, we propose an interference-aware VM scheduler at the cluster level with the goal of minimizing interference. NVIDIA vGPU provides sharing capability and high performance, but it has unique performance characteristics, which have not been studied thoroughly before. Our study reveals several key observations. We leverage these observations to construct models based on machine learning techniques to predict interference between co-located VMs on the same GPU. We propose a system architecture that leverages our models to schedule VMs so as to minimize interference. The experiments show that our observations improve model accuracy (by 15%–40%) and that the scheduler reduces application run-time overhead by 24.2% in simulated scenarios.
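A minimal sketch of the interference-aware placement step, with a made-up stand-in for the learned interference model (not the paper's system), is:

# Hypothetical sketch: given a predictor of pairwise slowdown between a
# candidate VM and the VMs already on each GPU, pick the GPU with the lowest
# predicted interference.

def predict_slowdown(vm_a: dict, vm_b: dict) -> float:
    # Stand-in for the learned model: here, interference grows with the product
    # of the two VMs' GPU-utilization profiles (a made-up feature).
    return 0.3 * vm_a["gpu_util"] * vm_b["gpu_util"]

def place(new_vm: dict, gpus: dict) -> str:
    def cost(residents):
        return sum(predict_slowdown(new_vm, vm) for vm in residents)
    return min(gpus, key=lambda g: cost(gpus[g]))

gpus = {
    "gpu-0": [{"gpu_util": 0.9}],                    # one busy neighbor
    "gpu-1": [{"gpu_util": 0.2}, {"gpu_util": 0.1}], # two light neighbors
}
print(place({"gpu_util": 0.7}, gpus))                # picks the lower-interference GPU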

Speakers
Xin Xu (VMware Inc)
Na Zhang (VMware Inc)
Michael Cui (VMware Inc)
Michael He (The University of Texas at Austin)
Ridhi Surana (VMware Inc)


Monday July 8, 2019 2:45pm - 3:00pm PDT
HotCloud: Grand Ballroom VII–IX

3:00pm PDT

Happiness index: Right-sizing the cloud’s tenant-provider interface
Cloud providers and their tenants have a mutual interest in identifying optimal configurations in which to run tenant jobs, i.e., ones that achieve tenants' performance goals at minimum cost, or that maximize performance within a specified budget. However, different tenants may have different performance goals that are opaque to the provider. A consequence of this opacity is that providers today typically offer fixed bundles of cloud resources, which tenants must themselves explore and choose from. This is burdensome for tenants and can lead to choices that are sub-optimal for both parties.

We thus explore a simple, minimal interface that lets tenants communicate their happiness with the cloud infrastructure to the provider, and enables the provider to explore resource configurations that maximize this happiness. Our early results indicate that this interface could strike a good balance between enabling efficient discovery of application resource needs and avoiding the complexity of communicating a full description of tenant utility under different configurations to the provider.
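A minimal sketch of such an interface, with a toy tenant utility function and a hypothetical greedy search on the provider side (not the authors' mechanism), is shown below:

import random

# Hypothetical sketch: the tenant only reports a scalar happiness score for the
# current configuration, and the provider greedily explores neighboring
# resource bundles to raise it.

def tenant_happiness(config, budget=10.0, perf_goal=100.0):
    # Opaque to the provider; here, a toy utility: meet the performance goal
    # while staying under budget.
    perf = 12.0 * config["cores"] + 4.0 * config["memory_gb"]
    cost = 1.5 * config["cores"] + 0.5 * config["memory_gb"]
    return min(perf / perf_goal, 1.0) - max(0.0, (cost - budget) / budget)

def provider_search(report, start, steps=20):
    config, best = dict(start), report(start)
    for _ in range(steps):
        trial = dict(config)
        key = random.choice(["cores", "memory_gb"])
        trial[key] = max(1, trial[key] + random.choice([-1, 1]))
        score = report(trial)               # the only signal the provider gets
        if score > best:
            config, best = trial, score
    return config, best

random.seed(1)
print(provider_search(tenant_happiness, {"cores": 2, "memory_gb": 4}))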

Speakers
Vojislav Dukic (ETH Zurich)
Ankit Singla (ETH Zurich)


Monday July 8, 2019 3:00pm - 3:15pm PDT
HotCloud: Grand Ballroom VII–IX

3:15pm PDT

The True Cost of Containing: A gVisor Case Study
We analyze many facets of the performance of gVisor, a new security-oriented container engine that integrates with Docker and backs Google's serverless platform. We explore the effect gVisor's in-Sentry network stack has on network throughput, as well as the overheads of performing all file opens via gVisor's Gofer service. We further analyze gVisor startup performance, memory efficiency, and system-call overheads. Our findings have implications for the future design of similar hypervisor-based container engines.

Speakers
Ethan G. Young (University of Wisconsin-Madison)
Pengfei Zhu (University of Wisconsin-Madison)
Tyler Caraza-Harter (University of Wisconsin-Madison)
Andrea C. Arpaci-Dusseau (University of Wisconsin, Madison)
Remzi H. Arpaci-Dusseau (University of Wisconsin, Madison)


Monday July 8, 2019 3:15pm - 3:30pm PDT
HotCloud: Grand Ballroom VII–IX

3:30pm PDT

Carving Perfect Layers out of Docker Images
This paper and abstract are under embargo and will be released to the public on July 8, 2019.

Speakers
Dimitris Skourtis (IBM Research)
Lukas Rupprecht (IBM Research)
Vasily Tarasov (IBM Research-Almaden)
Nimrod Megiddo (IBM Research)


Monday July 8, 2019 3:30pm - 3:45pm PDT
HotCloud: Grand Ballroom VII–IX

3:45pm PDT

4:15pm PDT

Bridging the Edge-Cloud Barrier for Real-time Advanced Vision Analytics
Advanced vision analytics plays a key role in a plethora of real-world applications. Unfortunately, many of these applications fail to leverage the abundant compute resources in cloud services, because they require high computing resources and high-quality video input, but the (wireless) network connections between visual sensors (cameras) and the cloud/edge servers do not always provide sufficient and stable bandwidth to stream high-fidelity video data in real time.

This paper presents CloudSeg, an edge-to-cloud framework for advanced vision analytics that co-designs cloud-side inference with real-time video streaming to achieve both low latency and high inference accuracy. The core idea is to send the video stream in low resolution, but recover the high-resolution frames from the low-resolution stream via a super-resolution procedure tailored to the actual analytics tasks. In essence, CloudSeg trades additional cloud-side computation (super-resolution) for significantly reduced network bandwidth. Our initial evaluation shows that, compared to previous work, CloudSeg can reduce bandwidth consumption by ~6.8× with a negligible drop in accuracy.
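A minimal sketch of the pipeline shape, with simple decimation and pixel-replication upsampling standing in for the learned super-resolution model (not the CloudSeg code), is:

import numpy as np

# Hypothetical sketch: the camera side downsamples frames to save uplink
# bandwidth; the cloud side recovers higher resolution before running the
# analytics model. A real system would use a learned, task-tailored
# super-resolution model; pixel replication stands in for it here.

def downsample(frame: np.ndarray, factor: int) -> np.ndarray:
    return frame[::factor, ::factor]                 # cheap client-side decimation

def upsample(frame: np.ndarray, factor: int) -> np.ndarray:
    # Placeholder for the cloud-side super-resolution step.
    return np.kron(frame, np.ones((factor, factor), dtype=frame.dtype))

def analytics(frame: np.ndarray) -> float:
    return float(frame.mean())                       # stand-in for segmentation etc.

frame = np.random.default_rng(0).integers(0, 256, size=(480, 640), dtype=np.uint8)
low = downsample(frame, 4)                           # ~16x fewer pixels on the wire
recovered = upsample(low, 4)
print(frame.shape, low.shape, recovered.shape, analytics(recovered))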

Speakers
Junchen Jiang (University of Chicago)


Monday July 8, 2019 4:15pm - 4:30pm PDT
HotCloud: Grand Ballroom VII–IX

4:30pm PDT

NLUBroker: A Flexible and Responsive Broker for Cloud-based Natural Language Understanding Services
Cloud-based Natural Language Understanding (NLU) services are becoming increasingly popular with the development of artificial intelligence, and more applications are being integrated with them to enhance the way people communicate with machines. However, with NLU services offered by different companies and powered by undisclosed AI technology, choosing the best one is a problem for users. To our knowledge, there is currently no platform that can provide guidance to users and make recommendations based on their needs. To fill this gap, we propose NLUBroker, a platform that comprehensively measures the performance indicators of candidate NLU services and provides a broker to select the most suitable service according to the different needs of users. Our evaluation shows that different NLU services lead in different aspects, and that NLUBroker is able to improve the quality of experience by automatically choosing the best service. In addition, reinforcement learning is used to support NLUBroker with an intelligent agent in a dynamic environment, and the results are promising.
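A minimal sketch of the broker step, with made-up measurements and a weighted score rather than NLUBroker's actual indicators or its reinforcement-learning agent, is:

# Hypothetical sketch: score each candidate NLU service against user-specified
# weights over measured indicators and pick the highest-scoring one.

MEASUREMENTS = {
    "service_a": {"accuracy": 0.92, "latency_ms": 180, "cost_per_1k": 1.5},
    "service_b": {"accuracy": 0.88, "latency_ms": 90,  "cost_per_1k": 1.0},
    "service_c": {"accuracy": 0.95, "latency_ms": 400, "cost_per_1k": 3.0},
}

def score(metrics, weights):
    # Higher accuracy is good; latency and cost count as penalties.
    return (weights["accuracy"] * metrics["accuracy"]
            - weights["latency"] * metrics["latency_ms"] / 1000.0
            - weights["cost"] * metrics["cost_per_1k"])

def choose(weights):
    return max(MEASUREMENTS, key=lambda s: score(MEASUREMENTS[s], weights))

print(choose({"accuracy": 1.0, "latency": 0.1, "cost": 0.0}))   # accuracy-seeking user
print(choose({"accuracy": 0.5, "latency": 2.0, "cost": 0.2}))   # latency-sensitive user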

Speakers
Lanyu Xu (Wayne State University)
Arun Iyengar (IBM T.J. Watson Research Center)
Weisong Shi (Wayne State University)


Monday July 8, 2019 4:30pm - 4:45pm PDT
HotCloud: Grand Ballroom VII–IX

4:45pm PDT

Static Call Graph Construction in AWS Lambda Serverless Applications
We present new means for performing static program analysis on serverless programs. We propose a new type of call graph that captures the stateless, event-driven nature of such programs and describe a method for constructing these new extended service call graphs. Next, we survey applications of program analysis that can leverage our extended service call graphs to answer questions about code that executes on a serverless platform. We present findings on the applicability of our techniques to real open source serverless programs. Finally, we close with several open questions about how to best incorporate static analysis in problem solving for developing serverless applications.
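A minimal sketch of an extended service call graph, using a hypothetical S3-trigger example rather than the authors' construction algorithm, is shown below; the event edge is what distinguishes it from an ordinary call graph:

# Hypothetical sketch: in addition to ordinary call edges inside each handler,
# add edges for event-driven hand-offs, e.g. a Lambda that writes to a bucket
# which is configured to trigger another Lambda.

CALL_EDGES = {                        # from static analysis of each handler
    "upload_handler": ["validate", "s3.put_object"],
    "thumbnail_handler": ["resize", "s3.put_object"],
    "validate": [], "resize": [], "s3.put_object": [],
}

EVENT_TRIGGERS = {                    # from the deployment configuration
    ("s3.put_object", "uploads-bucket"): "thumbnail_handler",
}

BUCKET_WRITTEN_BY = {"upload_handler": "uploads-bucket",
                     "thumbnail_handler": "thumbnails-bucket"}

def extended_edges(fn):
    edges = list(CALL_EDGES.get(fn, []))
    if "s3.put_object" in edges:
        target = EVENT_TRIGGERS.get(("s3.put_object", BUCKET_WRITTEN_BY.get(fn)))
        if target:
            edges.append(target)      # event edge: the write triggers another handler
    return edges

def reachable(entry):
    seen, stack = set(), [entry]
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue
        seen.add(fn)
        stack.extend(extended_edges(fn))
    return seen

# The ordinary call graph of upload_handler stops at s3.put_object; the extended
# graph also reaches thumbnail_handler and everything it calls.
print(sorted(reachable("upload_handler")))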

Speakers
Matthew Obetz (Rensselaer Polytechnic Institute)
Stacy Patterson (Rensselaer Polytechnic Institute)
Ana Milanova (Rensselaer Polytechnic Institute)


Monday July 8, 2019 4:45pm - 5:00pm PDT
HotCloud: Grand Ballroom VII–IX

5:00pm PDT

Agile Cold Starts for Scalable Serverless
The serverless or Function-as-a-Service (FaaS) model capitalizes on lightweight execution by packaging code and dependencies together for just-in-time dispatch. Often a container environment has to be set up afresh, a condition called a "cold start"; in such cases, performance suffers and overheads mount, both deteriorating rapidly under high concurrency. Caching and reusing previously employed containers ties up memory and risks information leakage. Cold-start latency is frequently due to work and wait times in setting up various dependencies, such as initializing networking elements. This paper proposes a solution that pre-crafts such resources and then dynamically re-associates them with baseline containers. Applied to networking, this approach demonstrates an order-of-magnitude gain in cold-start time, negligible memory consumption, and flat startup time under rising concurrency.
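A minimal sketch of the pre-crafting idea, with a sleep standing in for network setup and hypothetical helper names (not the paper's implementation), is:

import queue, time

# Hypothetical sketch: network setup (simulated with a sleep) is done ahead of
# time into a pool, and a cold start only pops a ready-made network handle and
# re-associates it with the new container.

NETWORK_SETUP_COST_S = 0.05      # stand-in for veth/bridge/namespace setup work

def create_network_handle(i):
    time.sleep(NETWORK_SETUP_COST_S)          # paid in the background, off the critical path
    return {"id": i, "veth": f"veth{i}"}

class NetworkPool:
    def __init__(self, size):
        self.pool = queue.Queue()
        for i in range(size):                 # pre-craft resources ahead of demand
            self.pool.put(create_network_handle(i))

    def acquire(self):
        return self.pool.get_nowait()         # O(1) at cold-start time

def cold_start(pool=None):
    start = time.perf_counter()
    net = pool.acquire() if pool else create_network_handle(-1)
    # ... container creation would re-associate `net` with the new sandbox ...
    return time.perf_counter() - start

pool = NetworkPool(size=4)
print(f"without pool: {cold_start():.3f}s  with pool: {cold_start(pool):.3f}s")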

Speakers
Anup Mohan (Intel Corporation)
Harshad Sane (Intel Corporation)
Kshitij Doshi (Intel Corporation)
Saikrishna Edupuganti (Intel Corporation)
Naren Nayak (Intel Corporation)


Monday July 8, 2019 5:00pm - 5:15pm PDT
HotCloud: Grand Ballroom VII–IX

5:15pm PDT

Co-evolving Tracing and Fault Injection with Box of Pain
Distributed systems are hard to reason about, largely because of uncertainty about what may go wrong in a particular execution and about whether the system will mitigate those faults. Tools that perturb executions can help test whether a system is robust to faults, while tools that observe executions can help better understand their system-wide effects. We present Box of Pain, a tracer and fault injector for unmodified distributed systems that addresses both concerns by interposing at the system-call level and dynamically reconstructing the partial order of communication events based on causal relationships. Box of Pain's lightweight approach to tracing, and its focus on simulating the effects of partial failures on communication rather than the failures themselves, set it apart from other tracing and fault-injection systems. We present evidence of the promise of Box of Pain and its approach to lightweight observation and perturbation of distributed systems.
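A minimal sketch of one ingredient the abstract describes, reconstructing a partial order from intercepted send/receive events under hypothetical event tuples (not Box of Pain's tracer), is:

from collections import defaultdict

# Hypothetical sketch: build a partial order of communication events from
# intercepted send/recv system calls, using the rules that a receive happens
# after the matching send and that events on one process follow program order.

events = [                     # (process, index-on-process, kind, message-id)
    ("A", 0, "send", "m1"),
    ("B", 0, "recv", "m1"),
    ("B", 1, "send", "m2"),
    ("A", 1, "recv", "m2"),
]

happens_before = defaultdict(set)    # event -> set of events that must precede it
sends = {}

for proc, idx, kind, msg in events:
    ev = (proc, idx)
    if idx > 0:
        happens_before[ev].add((proc, idx - 1))      # program order on one process
    if kind == "send":
        sends[msg] = ev
    elif kind == "recv" and msg in sends:
        happens_before[ev].add(sends[msg])           # message order across processes

def precedes(a, b, hb=happens_before):
    # Transitive-closure query: does event a happen before event b?
    stack, seen = [b], set()
    while stack:
        e = stack.pop()
        if e == a:
            return True
        if e in seen:
            continue
        seen.add(e)
        stack.extend(hb[e])
    return False

print(precedes(("A", 0), ("A", 1)))   # True: A's send reaches A's later recv via B
print(precedes(("B", 1), ("A", 0)))   # False: not ordered in that direction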

Speakers
Daniel Bittman (UC Santa Cruz)
Ethan L. Miller (UC Santa Cruz)
Peter Alvaro (UC Santa Cruz)


Monday July 8, 2019 5:15pm - 5:30pm PDT
HotCloud: Grand Ballroom VII–IX