BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160726Z
LOCATION:D173
DTSTART;TZID=America/Chicago:20181111T145000
DTEND;TZID=America/Chicago:20181111T150000
UID:submissions.supercomputing.org_SC18_sess163_ws_works106@linklings.com
SUMMARY:Optimizing the Throughput of Storm-Based Stream Processing in Clou
 ds
DESCRIPTION:Workshop\nReproducibility, Scientific Computing, Scientific Wo
 rkflows, Workflows, Workshop Reg Pass, HPC, Data Intensive\n\nOptimizing t
 he Throughput of Storm-Based Stream Processing in Clouds\n\nCao, Wu, Bao, 
 Hou\n\nThere is a rapidly growing need for processing large volumes of str
 eaming data in real time in various big data applications. As one of the m
 ost commonly used systems for streaming data processing, Apache Storm prov
 ides a workflow-based mechanism to execute directed acyclic graph (DAG)-st
 ructured topologies. With the expansion of cloud infrastructures around th
 e globe and the economic benefits of cloud-based computing and storage ser
 vices, many such Storm workflows have been shifted or are in active transi
 tion to clouds. However, modeling the behavior of streaming data processin
 g and improving its performance in clouds still remain largely unexplored.
   We construct rigorous cost models to analyze the throughput dynamics of 
 Storm workflows and formulate a budget-constrained topology mapping proble
 m to maximize Storm workflow throughput in clouds. We show this problem to
  be NP-complete and design a heuristic solution that takes into considerat
 ion not only the selection of virtual machine type but also the degree of 
 parallelism for each task (spout/bolt) in the topology. The performance su
 periority of the proposed mapping solution is illustrated through extensiv
 e simulations and further verified by real-life workflow experiments deplo
 yed in public clouds in comparison with the default Storm and other existi
 ng methods.
URL:https://sc18.supercomputing.org/presentation/?id=ws_works106&sess=sess
 163
END:VEVENT
END:VCALENDAR

