STAGE ISSS: Jonathan Marchini

Join us for the next instalment of the STAGE International Speaker Seminar Series (ISSS) with

Dr. Jonathan Marchini

Executive Director, Head of Statistical Genetics and Machine Learning
Regeneron Genetics Center

Talk Title:

Statistical methods for large scale genetic association studies


The study of rare genetic variation, which can be important in the development of complex diseases, has become increasingly common thanks to advances in sequencing technologies. The inherent challenges posed by the rarity of these variants and the need for large sample sizes have required the development of gene-based tests. These tests, which offer greater statistical power than single-variant tests, aggregate information across multiple variants and can integrate external functional annotations to improve the power of rare variant analysis. We will describe several existing and novel features in the association tool REGENIE and related programs, which have been developed at the Regeneron Genetics Center (RGC) to carry out analyses of over 2 million exome-sequenced and genotyped individuals across a diverse set of cohorts with many thousands of phenotypes. We highlight the power of meta-analysis in genetic studies; this involves combining information across studies without requiring access to individual-level data. It can be performed using summary statistics from a genome-wide scan of individual variants, or from gene-based tests that aggregate variants within a gene. The former approach is effective at identifying common variants with modest effects, while the latter boosts power for detecting rare variant associations. In this vein, we showcase REMETA, a tool designed for the efficient meta-analysis of gene-based tests in rare variant studies, suitable for biobank-scale data sets. REMETA combines results from multiple studies, enhancing the statistical power and reliability of the findings. We demonstrate the usefulness of these approaches for rare variant association testing through large-scale data applications.

Speaker Profile:

Jonathan Marchini is Head of Statistical Genetics and Machine Learning at the Regeneron Genetics Center. His research spans statistical and population genomics and machine learning. He pioneered the statistical method of genotype imputation, in which correlations between genetic variants (known as linkage disequilibrium) are used to predict (impute) genotypes at genetic variants not directly typed. This approach has been used in almost all genome-wide association studies (GWAS) since its inception in 2006, and is the basis upon which many international meta-analysis consortia have been built.

He has published methodological papers on SNP and CNV genotype calling, Bayesian association tests, population structure inference, gene-gene interactions, gene-environment interactions, haplotype estimation, genotype imputation, variant calling from sequencing, multi-phenotype analysis, tensor decomposition for RNA-sequence analysis and multi-omics data integration. More recently, the REGENIE machine learning method has revolutionised the use of whole-genome regression and linear mixed models in GWAS.

His software and methods (such as imputation) were used to analyze the first ever large GWAS in 2007 as part of the Wellcome Trust Case Control Consortium (WTCCC) and set the standard for the development of this field. His work on phasing and imputation from sequence data was a central methodological development used by the 1000 Genomes Project and the UK Biobank project. His research group carried out the first large-scale analysis of thousands of MRI brain imaging derived phenotypes from the UK Biobank. He co-led the Haplotype Reference Consortium, which brought together the largest collection of human whole genome sequence data to create an imputation panel that has been widely used by the human genetics community. He has received funding from the MRC, Wellcome Trust and the European Research Council. In 2012 Jonathan was awarded a Philip Leverhulme Research Prize.


CANSSI Ontario STAGE (STAGE) is a training program in genetic epidemiology and statistical genetics, housed at the University of Toronto Dalla Lana School of Public Health, and funded by CANSSI Ontario at U of T, an extra-departmental unit in the Faculty of Arts & Science that is home to the Ontario Regional Centre of the Canadian Statistical Sciences Institute (CANSSI).

Seminars are sponsored by The Hospital for Sick Children (Genetics & Genome Biology Program), the Lunenfeld-Tanenbaum Research Institute, and the McLaughlin Centre at the University of Toronto.

Photography Disclosure:

Photographs and/or video may be taken of participants at STAGE events. These photos/videos are for the Program’s use only and may appear on its website, in printed brochures, or in other promotional or reporting materials. By attending STAGE events, you accept the possibility that you may be videotaped or photographed. If you have any concerns, please inform us by sending an e-mail to

Local Time

  • Timezone: America/New_York
  • Date: Jun 07 2024
  • Time: 12:00 pm - 1:00 pm


Zoom (Online)
Zoom (Live Stream)


CANSSI Ontario



Philip Awadalla

Director, Computational Biology, OICR