CANSSI Ontario Postdoctoral Fellowship in Genome Data Science – Award Recipient: Yixiao Zeng

CANSSI Ontario is pleased to announce Yixiao Zeng as the recipient of a CANSSI Ontario Postdoctoral Fellowship in Genome Data Science for his project Next-Gen Scalable Linear Mixed Models for Enhanced GWAS at Biobank Scale.

Dr. Yixiao Zeng brings a multidisciplinary background in computational genomics. Originally trained in financial engineering, he transitioned to the life sciences, driven by a strong curiosity about human genetics, and earned his Ph.D. in Quantitative Life Sciences at McGill University under the mentorship of Dr. Celia Greenwood. His doctoral research integrated biostatistics, bioinformatics, machine learning, and high-dimensional data analysis to address complex biological questions.

A dedicated supporter of open science, Dr. Zeng creates and maintains easy-to-use, open-source software. His recent tools include missoNet, a multi-trait regression framework that handles missing data by using correlations across related phenotypes, and DKLasso/DKLasso+, deep learning architectures that combine deep Gaussian processes with feature selection to improve transparency and reliability in risk-sensitive applications. These methods are already used in genomics, neuroscience, and epidemiology for feature selection and multi-omics integration. Dr. Zeng also works with clinical teams in psychology and oncology to bring solid data science methods into translational research. Dr. Zeng will begin a postdoctoral fellowship at the University of Toronto with Professors Gary Bader, Charles Boone, and Michael Wainberg.

This research will focus on building scalable computational methods for large biobank genomic and health datasets, addressing challenges such as population stratification, phenotypic heterogeneity, and environmental interactions in studies with hundreds of thousands of participants and millions of genetic variants.

CO PDF Project Synopsis

Background:

Genome-wide association studies (GWAS) of large biobank data (e.g., UK Biobank, All of Us) have deepened our understanding of complex traits. Yet, issues in population stratification, computational scalability, binary-trait modeling, and rare-variant analysis persist.

Next-Generation GWAS Toolkits: 

Dr. Zeng will expand upon the Scalable Linear Mixed Model (SLMM) to develop a novel, integrated platform for GWAS. Key features (aims) include:
  • High-performance LMM. SLMM provides exact likelihood inference with computational speed orders of magnitude faster than standard tools (e.g., GCTA), near-linear scaling across computing nodes, and out-of-core capabilities for datasets beyond memory limits.
  • Binary Trait Extension. Incorporation of logistic mixed models with calibration via Saddlepoint Approximation and Firth correction to ensure unbiased case–control GWAS.
  • Rare Variant Analysis. Implementation of region-based joint tests (e.g., SKAT-O, burden–SKAT) with minor allele frequency weighting to maximize detection power for low-frequency variants.
  • GPU Acceleration. Leveraging GPU architectures to accelerate matrix operations (genetic relationship matrices, preconditioned conjugate gradients, score tests) for cohort analyses at unprecedented scale.
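
To make one of the matrix operations named above more concrete, the following is a minimal sketch of a preconditioned conjugate gradient solve for a linear mixed model covariance system. It is an illustration only, not SLMM's implementation: the function names (pcg_solve, apply_V), the simulated genotypes, the fixed variance components, and the choice of a simple diagonal (Jacobi) preconditioner are all assumptions made for this example.

```python
import numpy as np

def pcg_solve(apply_A, b, precond_diag, tol=1e-8, max_iter=500):
    """Solve A x = b for a symmetric positive-definite A using
    conjugate gradients with a diagonal (Jacobi) preconditioner."""
    x = np.zeros_like(b)
    r = b - apply_A(x)            # initial residual
    z = r / precond_diag          # preconditioned residual
    p = z.copy()                  # first search direction
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rz / (p @ Ap)     # step length along p
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            break
        z = r / precond_diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p # next conjugate direction
        rz = rz_new
    return x

# Simulated data (hypothetical sizes): n individuals, m variants.
rng = np.random.default_rng(0)
n, m = 200, 1000
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
Z = (G - G.mean(axis=0)) / G.std(axis=0)   # standardized genotypes

# Assumed variance components; in practice these would be estimated.
sigma_g2, sigma_e2 = 0.5, 0.5

def apply_V(v):
    """Multiply by V = sigma_g2 * Z Z^T / m + sigma_e2 * I
    without ever forming the n-by-n relationship matrix."""
    return sigma_g2 * (Z @ (Z.T @ v)) / m + sigma_e2 * v

# Diagonal of V, used as the preconditioner.
diag_V = sigma_g2 * (Z ** 2).sum(axis=1) / m + sigma_e2

y = rng.normal(size=n)                     # simulated phenotype
x = pcg_solve(apply_V, y, diag_V)          # x approximates V^{-1} y
```

Because the solver only needs matrix-vector products, the genetic relationship matrix is never stored explicitly, which is one reason iterative solvers of this kind are attractive for cohorts too large for dense factorizations.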


Innovation and Impact:

By uniting statistical rigor with computational efficiency, the enhanced SLMM framework will enable comprehensive GWAS on biobank-scale cohorts without shrinking data or simplifying model assumptions. It aims to be the fastest and most reliable GWAS tool available, capable of analyses once thought out of reach. This integrated platform will support common SNP tests, binary traits, and rare-variant analyses, adapting flexibly to different dataset needs.

Consequently, this toolkit will let researchers identify genetic associations, biomarkers, and predictive models with higher accuracy and resolution across diverse populations, thereby accelerating discoveries in precision medicine and population genomics.

About

The CANSSI Ontario Postdoctoral Fellowship in Genome Data Science is designed to support the methodological work of an early-career investigator working in genomics and data science with an emphasis on new genomic technologies or multi-omic integration. The goal of the award is to attract and retain top-tier postdoctoral talent, both nationally and internationally.

The Fellowship offers two years of salary support of up to $50,000 CAD annually for postdoctoral fellows undertaking full-time research at a CANSSI Ontario partner university or one of its affiliated research institutes.

Candidates are responsible for selecting, contacting, and securing the commitment of two faculty members to jointly supervise them in their project, where at least one is a faculty member with a PhD in statistics, biostatistics, epidemiology, computational biology, genomics, or computer science. The second supervisor can be from any other field.

CANSSI Ontario is the Ontario Regional Centre of the Canadian Statistical Sciences Institute (CANSSI). Its goal is to strengthen and enhance research and training in data science by developing programs that promote interdisciplinary research and enable multidisciplinary collaborations.
