BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160728Z
LOCATION:D220
DTSTART;TZID=America/Chicago:20181112T153000
DTEND;TZID=America/Chicago:20181112T160000
UID:submissions.supercomputing.org_SC18_sess172_ws_phpsc107@linklings.com
SUMMARY:Data-Parallel Python for High Energy Physics Analyses
DESCRIPTION:Workshop\nParallel Application Frameworks, Reproducibility, Sc
 ientific Computing, Workshop Reg Pass\n\nData-Parallel Python for High Ene
 rgy Physics Analyses\n\nPaterno, Green, Kowalkowski, Sehrish\n\nIn this pa
 per, we explore features available in Python which are useful for data red
 uction tasks in High Energy Physics (HEP). High-level abstractions in Pyth
 on are convenient for implementing data reduction tasks. However, in order
  for such abstractions to be practical, the efficiency of their performanc
 e must also be high. Because the data sets we process are typically large,
  we care about both I/O performance and in-memory processing speed. In par
 ticular, we evaluate the use of data-parallel programming, using MPI and n
 umpy, to process a large experimental data set (42 TiB) stored in an HDF5 
 file. We measure the speed of processing of the data, distinguishing betwe
 en the time spent reading data and the time spent processing the data in m
 emory, and demonstrate the scalability of both, using up to 1200 KNL nodes
  (76800 cores) on Cori at NERSC.
URL:https://sc18.supercomputing.org/presentation/?id=ws_phpsc107&sess=sess
 172
END:VEVENT
END:VCALENDAR