The National Academies of Sciences, Engineering and Medicine
Board on Mathematical Sciences and Analytics

Roundtable on Data Science Postsecondary Education Meeting #6:

Improving Reproducibility by Teaching Data Science as a Scientific Process

March 23, 2018

Hotel Shattuck Plaza
2086 Alston Way, Crystal Ballroom Section 2
Berkeley, CA 94704




The National Academies of Sciences, Engineering, and Medicine will hold a one-day meeting and webcast on "Improving Reproducibility by Teaching Data Science as a Scientific Process" on March 23, 2018. The meeting will bring together data scientists and educators in academia and industry to 1) discuss how data science can help understand and improve reproducibility of scientific research, and 2) learn about several courses and training offerings for reproducible data science.

This event is the sixth in an ongoing series of Roundtable meetings that take place approximately four times per year. This roundtable was organized by the Committee on Applied and Theoretical Statistics in conjunction with the Board on Mathematical Sciences and Their Applications, the Computer Science and Telecommunications Board, and the Board on Science Education.


Download the final agenda


Complete video playlist


Meeting #6 Highlights


Meeting Agenda

Friday, March 23
Improving Reproducibility by Teaching Data Science as a Scientific Process

9 a.m. Welcome, new members, and introduction to the day
Eric Kolaczyk, Boston University
Kathy McKeown, Columbia University


9:15 a.m. Data Science as a Science: Methods and Tools at the Intersection of Data Science and
Victoria Stodden, University of Illinois, Urbana-Champaign

Video -- Presentation

9:45 a.m. Teaching Reproducible Data Science: Lessons Learned from a Course at Berkeley
Fernando Perez, University of California, Berkeley

Video -- Presentation

10:15 a.m. Break

10:35 a.m. Reproducible Machine Learning—The Team Data Science Process
Buck Woody, Microsoft Research

Video -- Presentation

11:05 a.m. Group discussion of morning presentations



11:45 a.m. Lunch

12:45 p.m. Training as a Pathway to Improve Reproducibility
Tracy Teal, Data Carpentry

Video -- Presentation

1:15 p.m. Rigor, Reproducibility, and Transparency Training in Biomedical Research
Alison Gammie, National Institute of General Medical Sciences

Video -- Presentation

1:45 p.m. Buried in Data, Starving for Information: How Measurement Noise is Blocking Scientific
Timothy Gardner, Riffyn

Video -- Presentation

2:15 p.m. Break

2:30 p.m. Group discussion of afternoon presentations


3 p.m. Begin breakout group discussions

3:40 p.m. Report back of breakout group discussions and closing

4:05 p.m. Adjourn meeting