Data Science ARES: Fernando Pérez
Join us at the Data Science Applied Research and Education Seminar (ARES) with:
Dr. Fernando Pérez
Associate Professor in Statistics
Faculty Scientist, Data Science and Technology Division, Lawrence Berkeley National Laboratory
Berkeley Institute for Data Science Faculty Affiliate
University of California, Berkeley
Free Event | Registration Required
Talk Title: Reproducibility and open science with the Jupyter ecosystem: from research to teaching
Abstract: I will discuss how today’s open source ecosystem allows for research and educational practices that can make science more open, collaborative and reproducible. I teach a course at UC Berkeley on this topic, and I have used the course as a space to experiment with tools and practices to make those very ideas possible: I work in the same environment as the students, in a cloud-native environment with version-controlled configuration and home directories, GitHub Classroom for all individual and group assignments, tools for data sharing and synchronization, virtual desktop support and more. I also use a similar setup for a cloud-hosted private deployment of Jupyter tools that I use for my research group and collaborators. And finally, these tools match what is used at national HPC facilities where I conduct some of my research.
By having a uniform toolkit based on open tools and practices, we can now move fluidly between local installations, the cloud and HPC, enhancing collaboration and reproducibility. I will discuss both my current experiences and some ideas on how we can continue building this vision for more effective workflows in the future.
Speaker Profile: Fernando Pérez (@fperez_org) is an Associate Professor in Statistics at UC Berkeley and scientist at LBNL. He builds open source tools for humans to use computers as companions in thinking and collaboration, mostly in the scientific Python ecosystem (IPython, Jupyter & friends). A computational physicist by training, his research interests include questions at the nexus of software and geoscience, seeking to build the computational and data ecosystem to tackle problems like climate change with collaborative, open, reproducible, and extensible scientific practices. He is a co-founder of Project Jupyter, the 2i2c.org initiative, the Berkeley Institute for Data Science and the NumFOCUS Foundation. He is a recipient of the 2017 ACM Software System Award and the 2012 FSF Award for the Advancement of Free Software.