BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160728Z
LOCATION:D165
DTSTART;TZID=America/Chicago:20181112T170000
DTEND;TZID=America/Chicago:20181112T173000
UID:submissions.supercomputing.org_SC18_sess161_ws_pmbsf122@linklings.com
SUMMARY:Automated Instruction Stream Throughput Prediction for Intel and A
 MD Microarchitectures
DESCRIPTION:Workshop\nBenchmarks, Parallel Programming Languages, Librarie
 s, and Models, Performance, Simulation, Workshop Reg Pass\n\nAutomated Ins
 truction Stream Throughput Prediction for Intel and AMD Microarchitectures
 \n\nLaukemann, Hammer, Hofmann, Hager, Wellein\n\nAn accurate prediction o
 f scheduling and execution of instruction streams is a necessary prerequis
 ite for predicting the in-core performance behavior of throughput-bound lo
 op kernels on out-of-order processor architectures. Such predictions are a
 n indispensable component of analytical performance models, such as the Ro
 ofline and the Execution-Cache-Memory (ECM) model, and allow a deep unders
 tanding of the performance-relevant interactions between hardware architec
 ture and loop code.\n\nWe present the Open Source Architecture Code Analyz
 er (OSACA), a static analysis tool for predicting the execution time of se
 quential loops comprising x86 instructions under the assumption of an infi
 nite first-level cache and perfect out-of-order scheduling. We show the pr
 ocess of building a machine model from available documentation and semi-au
 tomatic benchmarking, and carry it out for the latest Intel Skylake and AM
 D Zen microarchitectures.\n\nTo validate the constructed models, we apply
  them to several assembly kernels and compare runtime predictions with act
 ual measurements. Finally, we give an outlook on how the method may be gen
 eralized to new architectures.
URL:https://sc18.supercomputing.org/presentation/?id=ws_pmbsf122&sess=sess
 161
END:VEVENT
END:VCALENDAR