BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160730Z
LOCATION:C140/142
DTSTART;TZID=America/Chicago:20181114T103000
DTEND;TZID=America/Chicago:20181114T110000
UID:submissions.supercomputing.org_SC18_sess188_pap504@linklings.com
SUMMARY:Cooperative Rendezvous Protocols for Improved Performance and Over
 lap
DESCRIPTION:Paper\nArchitectures, MPI, Networks, Performance, Programming 
 Systems, State of the Practice, Tech Program Reg Pass, BSP Finalist\n\nCoo
 perative Rendezvous Protocols for Improved Performance and Overlap\n\nChak
 raborty, Bayatpour, Hashmi, Subramoni, Panda\n\nWith the emergence of larg
 er multi-/many-core clusters, performance of large message communication i
 s becoming more important. MPI libraries use different Rendezvous protocol
 s to perform large message communication. However, existing Rendezvous pro
 tocols do not consider the overall communication pattern or make optimal 
 use of the Sender and the Receiver CPUs. In this work, we propose a cooper
 ative Rendezvous protocol that can provide up to 2x improvement in intra-n
 ode bandwidth and latency for large messages. We also propose a scheme to 
 dynamically choose the best Rendezvous protocol for each message based on 
 the communication pattern. Finally, we show how these improvements can
  increase the overlap of computation with intra-node and inter-node
  communication, and lead to application-level benefits. We evaluate the
  proposed designs on three different architectures including Intel Xeon,
  Knights Landing, and OpenPOWER with different HPC applications and
  obtain benefits of up to 19% with Graph500, 16% with CoMD, and 10% with
  MiniGhost.
URL:https://sc18.supercomputing.org/presentation/?id=pap504&sess=sess188
END:VEVENT
END:VCALENDAR

