2023 Workshop on Reproducibility

CANSSI Ontario and the Data Sciences Institute at the University of Toronto are excited to host the Toronto Workshop on Reproducibility in February 2023. This two-day workshop brings together academic and industry participants on the critical issue of reproducibility in applied statistics and related areas.


Hourly Schedule

Wednesday, February 22, 2023.

08:30 - 17:15
Toronto Replication Games.
Participants will be matched with other researchers working in the same field (e.g., economics, American politics). Each team will work on replicating a recently published study from a leading economics or political science journal.

Interested researchers and teams should contact Abel Brodeur (abrodeur@uottawa.ca).

Thursday, February 23, 2023.

08:50 - 17:15
Workshop on Reproducibility.
This hybrid workshop is free and open to all.

The Workshop has three broad focus areas:

  1. Evaluating reproducibility: Systematically examining the extent of reproducibility of a paper, or even of a whole field, is important for understanding where weaknesses exist. Does, say, economics fall flat while demography shines? How should we approach these reproductions? What aspects contribute to the extent of reproducibility?

  2. Practices of reproducibility: We need new tools and approaches that encourage us to think more deeply about reproducibility and integrate it into everyday practice.

  3. Teaching reproducibility: While it is probably too late for most of us, how can we ensure that today’s students don’t repeat our mistakes? What are some case studies that show promise?
08:50 - 09:00
Opening Remarks.
Rohan Alexander, University of Toronto.
09:00 - 09:15
Reproducible Teaching in Statistics and Data Science Curricula.
Mine Dogucu, University College London & University of California Irvine.

Teaching reproducibility.

Abstract: In reproducibility, we often focus on 1) reproducible research practices and 2) teaching these practices to students. In this talk, I will talk about a third dimension of reproducibility: reproducible teaching. Instructors use tools and adopt practices in preparing their teaching materials. I will discuss how reproducibility relates to these tools and practices. I will share examples from my statistics and data science courses and make recommendations based on teaching experiences.
09:15 - 09:30
Reproducible Student Project Reports with Python + Quarto.
Debbie Yuster, Ramapo College of New Jersey.

Teaching reproducibility.

Abstract: TBC.
09:30 - 09:45
Moon and suicide: a case-in-point example of debunking a likely false-positive finding.
Martin Plöderl, Paracelsus Medical University, Salzburg, Austria.

Practices of reproducibility.

Abstract: TBC.
09:45 - 10:00
Audience Q&A and/or break.
10:00 - 10:15
Code execution during peer review - you can do it, too!
Daniel Nüst, CODECHECK & Reproducible AGILE | TU Dresden.

Evaluating reproducibility.

Abstract: TBC.
10:15 - 10:30
Towards greater standardization of reproducibility: TrovBase approach.
Sam Jordan, TrovBase.

Practices of reproducibility.

Abstract: Research code is difficult to understand and build upon because it isn’t standardized; research pipelines are artisanal. TrovBase is a data management platform that standardizes the process from dataset configuration to analysis, and does so in a way that makes sharing analysis (and building upon it) easy and fast. The TrovBase team will discuss how to make graphs and analysis maximally reproducible using TrovBase.
10:30 - 10:45
Sharing the Recipe: Reproducibility and Replicability in Research Across Disciplines.
Rima-Maria Rahal, Max Planck Institute for Research on Collective Goods.

Practices of reproducibility.

Abstract: The open and transparent documentation of scientific processes has been established as a core antecedent of free knowledge. This also holds for generating robust insights in the scope of research projects. To convince academic peers and the public, the research process must be understandable and retraceable (reproducible), and repeatable (replicable) by others, precluding the inclusion of fluke findings into the canon of insights. In this contribution, we outline what reproducibility and replicability (R&R) could mean in the scope of different disciplines and traditions of research and which significance R&R has for generating insights in these fields. We draw on projects conducted in the scope of the Wikimedia "Open Science Fellows Program" (Fellowship Freies Wissen), an interdisciplinary, long-running funding scheme for projects contributing to open research practices. We identify twelve implemented projects from different disciplines which primarily focused on R&R, and multiple additional projects also touching on R&R. From these projects, we identify patterns and synthesize them into a roadmap of how research projects can achieve R&R across different disciplines. We further outline the ground covered by these projects and propose ways forward.
10:45 - 11:00
Audience Q&A and/or break.
11:00 - 11:15
Certifying reproducibility.
Lars Vilhuber, Cornell University.

Evaluating reproducibility.

Abstract: One of the goals of reproducibility - the basis for all subsequent inquiries - is to assure users of a research compendium that it is complete. How do we do that? We re-run code. But what if the data underlying the compendium is confidential (sensitive)? What if it is transient (Twitter)? What if it is so big that it takes weeks to run? All of the above? I will talk about efforts in designing a way to credibly convey that the compendium has run at least once, and the many questions that might arise around that.
11:15 - 11:30
TBC.
Claudia Solis-Lemus, University of Wisconsin-Madison.

Practices of reproducibility.

Abstract: TBC.
11:30 - 11:45
A Computational Reproducibility Investigation of the Open Data Badge Policy in one Issue of Psychological Science.
Sophia Crüwell, University of Cambridge / Charité Medical University Berlin.

Evaluating reproducibility.

Abstract: TBC.
11:45 - 12:00
Audience Q&A and/or break.
12:00 - 12:45
Jae Hattrick-Simpers panel.
Panelists: TBC.
12:45 - 13:00
Audience Q&A and/or break.
13:00 - 13:15
Reproducible Open Science for All.
Yanina Bellini Saibene, rOpenSci.

Practices of reproducibility.

Abstract: Open Source and Open Science are global movements, but there is a dismaying lack of diversity in these communities. Non-English speakers and researchers working from the Global South face a significant barrier to being part of these movements. rOpenSci is carrying out a series of activities and projects to ensure our research software serves everyone in our communities, which means it needs to be sustainable, open, and built by and for all groups.
13:15 - 13:30
A reproducible workflow and software tools for working with the Global Extreme Sea Level Analysis (GESLA) dataset.
Fernando Mayer, Maynooth University.

Practices of reproducibility.

Abstract: In this talk, we are going to show a general reproducible workflow, in the context of the project entitled "Estimating sea levels and sea-level extremes for Ireland". We will demonstrate a set of software tools used to deal with a large, worldwide sea level dataset, called GESLA (Global Extreme Sea Level Analysis). This workflow and set of tools can hopefully help other researchers adopt the practice of reproducibility.
13:30 - 13:45
Qualitative Transparency Tools and Practice in Sexual and Reproductive Health Research.
Marielle Kirstein or Jen Mueller, Guttmacher Institute.

Practices of reproducibility.

Abstract: Reproducibility is fundamental to the open science movement to ensure science is transparent and accessible, but much of the work on reproducibility has come from quantitative research and data. However, the principles of transparency are equally relevant to qualitative researchers despite some unique challenges implementing transparent practices, given the nature of qualitative data. In this presentation, we will introduce the principles and concepts that underpin qualitative transparency and describe how we at the Guttmacher Institute have been developing and implementing qualitative transparency practices in our work. Guttmacher conducts policy-relevant research on sexual and reproductive health, and our qualitative data often includes sensitive content, underlining the ethical imperative to protect participant confidentiality. We will describe how we have embedded transparency into our qualitative projects through the use of transparency launch meetings and checklists, among other practices, and we will highlight previous and current projects at Guttmacher that are making some aspects of their projects publicly available.
13:45 - 14:00
Audience Q&A and/or break.
14:00 - 14:15
TBC.
14:15 - 14:30
TBC.
14:30 - 14:45
TBC.
14:45 - 15:00
TBC.
15:00 - 15:15
TBC.
15:15 - 15:30
TBC.
15:30 - 15:45
TBC.
Nick Radcliffe, Stochastic Solutions.

Practices of reproducibility.

Abstract: TBC.
15:45 - 16:00
Audience Q&A and/or break.
16:00 - 16:15
TBC.
Aya Mitani, University of Toronto.

Evaluating reproducibility.

Abstract: TBC.
16:15 - 16:30
Git is my lab book: "baking in" reproducibility.
Teaching reproducibility.

Abstract: I am part of an infectious diseases modelling group that has informed Australia's national pandemic preparedness and response plans for the past ~15 years. In collaboration with public health colleagues since 2015, we have developed and deployed near-real-time seasonal influenza forecasts. We rapidly adapted these methods to COVID-19 and, since April 2020, near-real-time COVID-19 forecasts have informed public health responses in Australian states and territories. Ensuring that our results are valid and reproducible is a key aspect of our research. We are also part of a broader consortium whose remit includes building sustainable quantitative research capacity in the Asia-Pacific region. In this talk I will discuss how we are trying to embed reproducible research practices into our EMCR cohort, beginning with version control workflows and normalising peer code review as an integral part of academic research.
16:30 - 16:45
The consequences of Excel autocorrection on genomic data.
Mandhri Abeysooriya, Deakin University.

Practices of reproducibility.

Abstract: TBC.
16:45 - 17:00
Audience Q&A and/or break.
17:00 - 17:15
Closing Remarks.
Lisa Strug, University of Toronto.

Local Time

  • Timezone: America/New_York
  • Date: Feb 22 - 23 2023
  • Time: All Day

Location

Maple Room, University of Toronto
Room UY9014, 9th Floor, 700 University Avenue, Toronto, ON M5G 1Z5