2024 Distinguished Lecture Series in Statistical Sciences: Susan Holmes

Free Hybrid Event (Virtual/In person) | Registration Required

Join us for this year’s Distinguished Lecture Series in Statistical Sciences with:

Susan Holmes

Professor
Department of Statistics
Stanford University

Speaker Profile

Susan Holmes has been working in nonparametric multivariate statistics applied to Biology since 1985. She started her research career in France at the INRAE institute in Montpellier. She has taught at MIT and Harvard, and was an Associate Professor of Biometry at Cornell before moving to Stanford in 1998. She likes working on big messy data sets, mostly from the areas of Immunology, Cancer Biology and Microbial Ecology, and her group developed the popular Bioconductor packages phyloseq and dada2 for microbiome data analyses.

Professor Holmes has co-authored an open access book with Wolfgang Huber (EMBL), Modern Statistics for Modern Biology, published by Cambridge University Press and based on a popular course she teaches at Stanford. Her work is funded by the NIH and the Bill and Melinda Gates Foundation. Her theoretical interests include applied probability, MCMC (Markov chain Monte Carlo), Graph Limit Theory, Differential Geometry and the topology of the space of Phylogenetic Trees.


Hourly Schedule

Wednesday, May 22, 2024

3:30 - 3:35
CANSSI Ontario Introduction & Welcome
Welcome Remarks
Speakers:
Sanjeena Dang
3:35 - 4:30
Talk Title: Hidden variables: using statistics to decode heterogeneous microbiome data
Most studies of clinical or environmental microbiota involve data that are heterogeneous at multiple levels. Some of the studies involve response variables that we aim to predict and understand; preterm birth, growth rates in undernourished children and insulin levels in diabetes are some examples. Standard statistical methods usefully separate unknown parameters from the data themselves and provide insight into the optimality properties of some standard estimates. This clarification provides useful insight into uncertainty quantification and enables optimized downstream experimental design. Analogies with methods in textual analyses (Natural Language Processing), such as the use of latent variable methods, provide useful interpretations, as shown by Sankaran and Holmes, 2018. Testing in the context of combined heterogeneous longitudinal data in perturbation studies of the human microbiome can be even more challenging because there are often a small number of samples with strong dependencies as well as a large number of features from multiple domains. These provide interesting data science challenges where mathematical models of the underlying factors can be plagued with non-identifiability that can make effective uncertainty quantification difficult. We have shown that Bayesian and Bootstrap approaches can provide nonparametric answers to the statistical challenges and have supplemented these with effective visualization techniques distributed as R packages (phyloseq, agPCA, treelapse, bootLong, dada2). This presentation will include joint work with Kris Sankaran, Julia Fukuyama, Ben Callahan, Claire Donnat, Joey McMurdie, Pratheepa Jeganathan, Lan Huong Nguyen and David Relman's group at Stanford.

Speakers:
Susan Holmes
4:30 - 5:30
Reception

Thursday, May 23, 2024

3:30 - 4:30
Talk Title: Statistics and Geometry for Heterogeneous Data
Today's challenges in immunology and microbiology center around the quantification of uncertainty and the design of experiments for heterogeneous multimodal data. We often have tens of thousands of features and only a few hundred samples. We need to create embeddings for graphs, trees and other non-Euclidean objects. Using the sample/feature duality in the data can often provide effective low-dimensional representations. However, some of the nonlinearities in the underlying factors and non-uniformity in the sampling pose extra challenges. Using local methods inspired by differential geometry, special maps and transformations can enable us to construct accompanying uncertainty contours even for data on curved manifolds. This talk gives examples where we have built software and geometrical tools that provide consensus spaces where we can build the uncertainty maps that we need when designing follow-up experiments. This talk includes joint work with my past lab members: Lan Huong Nguyen, Elisabeth Purdom, Christof Seiler, Nina Miolane, Claire Donnat, Kris Sankaran and Laura Symul.

Speakers:
Susan Holmes
Sanjeena Dang
Associate Professor, School of Mathematics and Statistics, Carleton University
Susan Holmes
Professor of Statistics, Stanford University

The event is finished.

Local Time

  • Timezone: America/New_York
  • Date: May 22 - 23 2024
  • Time: 3:30 pm - 4:30 pm

Location

U of T - Rooms 9014 & 9016
700 University Avenue, Toronto, ON

Speaker

CANSSI Ontario

Organizer

CANSSI Ontario
Email
esther.berzunza@utoronto.ca
Website
https://canssiontario.utoronto.ca

Moderator

Sanjeena Dang

Associate Professor & CRC in Data Science & Analytics, School of Mathematics and Statistics, Carleton University