BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160905Z
LOCATION:C146
DTSTART;TZID=America/Chicago:20181112T133000
DTEND;TZID=America/Chicago:20181112T170000
UID:submissions.supercomputing.org_SC18_sess243_tut174@linklings.com
SUMMARY:Exploiting HPC Technologies for Accelerating Big Data Processing a
 nd Associated Deep Learning
DESCRIPTION:Tutorial\nArchitectures, Data Analytics, Deep Learning, I/O, M
 achine Learning, Networks, Programming Systems, Tools, Tutorial Reg Pass\n
 \nExploiting HPC Technologies for Accelerating Big Data Processing and Ass
 ociated Deep Learning\n\nPanda, Lu, Gugnani\n\nThe convergence of HPC, Big
  Data, and Deep Learning is the next game-changing business opportunity. A
 pache Hadoop, Spark, gRPC/TensorFlow, and Memcached are becoming standard 
 building blocks for Big Data processing. Recent studies have shown that
  the default designs of these components cannot efficiently leverage the
  features of modern HPC clusters, such as RDMA-enabled high-performance
  interconnects, high-throughput parallel storage systems (e.g., Lustre),
  Non-Volatile Memory (NVM), and NVMe/NVMe-over-Fabric. This tutorial will
  provide an in-depth overview of the architecture of Hadoop, Spark,
  gRPC/TensorFlow, and Memcac
 hed. We will examine the challenges in re-designing networking and I/O
  components of these middleware systems with modern interconnects and
  storage architectures. Using the publicly available software packages
  in the High-Perform
 ance Big Data project (HiBD, http://hibd.cse.ohio-state.edu), we will prov
 ide case studies of the new designs for several Hadoop/Spark/gRPC/TensorFl
 ow/Memcached components and their associated benefits. Through these, we w
 ill also examine the interplay between high-performance interconnects, sto
 rage, and multi-core platforms to achieve the best solutions for these com
 ponents and applications on modern HPC clusters. We will also present
  in-depth case studies of modern Deep Learning tools (e.g., Caffe,
  TensorFlow, CNTK, BigDL) with RDMA-enabled Hadoop, Spark, and gRPC.
  Finally, hands-on exe
 rcises will be carried out with RDMA-Hadoop and RDMA-Spark software stacks
  over a cutting-edge HPC cluster.
URL:https://sc18.supercomputing.org/presentation/?id=tut174&sess=sess243
END:VEVENT
END:VCALENDAR