BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160902Z
LOCATION:D171
DTSTART;TZID=America/Chicago:20181115T133000
DTEND;TZID=America/Chicago:20181115T141500
UID:submissions.supercomputing.org_SC18_sess482_pec293@linklings.com
SUMMARY:The BP Data Science Sandbox
DESCRIPTION:HPC Impact Showcase\nWorkshop Reg Pass, Tutorial Reg Pass, Tec
 h Program Reg Pass, Exhibits Reg Pass, Exhibits - Exhibit Hall Only Reg Pa
 ss, Industry\n\nThe BP Data Science Sandbox\n\nGrossman, Yusifov\n\nRecent
  years have seen major advances in the state of the art in machine learnin
 g, particularly in fields such as natural language processing and 2D compu
 ter vision.\n\nThese advances have naturally spurred interest in the appli
 cation of similar techniques to new fields in medicine, science, and engin
 eering. However, the problems in these fields are differentiated from prev
 ious machine learning successes by the level of domain expertise required.
  While classifying an image as a cat, dog, horse, etc. is a task that any
 on
 e can understand, automatic identification of malignant tumors, subsurface
  faults, or financial fraud (for example) often requires far more backgrou
 nd in the specific domain. Unfortunately, it is rare today for people to h
 ave both the skills of a data scientist/statistician and a domain expert (
 e.g. an oncologist or petroleum engineer).\nThis problem can generally be 
 solved in two ways: (1) through education (of your data scientists and/or 
 domain experts), or (2) through co-location of these two groups of people 
 such that they can work closely together.\n\nThis talk will introduce the 
 BP Data Science Sandbox (DSS) – an internal environment at BP that support
 s both of the above solutions. The sandbox is a platform made up of hardwa
 re, software, and people. On the hardware front, the sandbox includes ever
 ything from big memory machines to GPU machines to compute clusters, enabl
 ing users of the sandbox to pick and choose the platform that meets their 
 resource requirements. On the software front, the sandbox is built on enti
 rely free and open source software, including common tools such as Jupyter
 , JupyterHub, Spark, Dask, TensorFlow, and other packages in the Conda eco
 system. On the people front, the sandbox is supported by a team of dedicat
 ed data scientists and infrastructure engineers who assist users and inte
 rnal customers of the sandbox.
URL:https://sc18.supercomputing.org/presentation/?id=pec293&sess=sess482
END:VEVENT
END:VCALENDAR

