
Student Seminar Series

Co-chaired by Tuo Lin and Anubhav Singh Sachan, the Seminar Series is an opportunity for Biostatistics Program graduate students to highlight their current research activities. See below for the recent and upcoming seminars. 

Upcoming Seminars


"Dynamic treatment effects: high-dimensional inference under model misspecification"

Yuqian Zhang, PhD
3:00 PM – 4:00 PM PST

ABSTRACT: In this talk, I will introduce the estimation and inference of average treatment effects in dynamic settings where covariates and treatments are longitudinal. We focus on high-dimensional cases where the sample size, N, is potentially much smaller than the dimension of the covariate vector, d. Marginal structural mean models are considered. We identify a new, broad doubly (multiply) robust estimator, which we name the "sequential model doubly robust estimator". We achieve root-N inference even when model misspecification occurs. For that purpose, new loss functions and new nuisance parameters, named "moment targeted", are introduced with the aim of reducing the bias caused by model misspecification. The new loss functions resolve a long-standing open problem of dynamic double robustness. We identify the weakest conditions to date that match naive intuition. Multiple-time model double robustness is achieved whenever each time exposure is itself model doubly robust. This significantly extends the literature even in low dimensions, where the doubly robust property requires a number of complex conditions to hold.

Recent Seminars


"Quantifying heritability and population stratification"

Anubhav Nikunj Singh Sachan
3:00 PM – 4:00 PM PST

ABSTRACT: Often one is interested in quantifying heritability, which is loosely defined as the proportion of the variation in a trait that is due to genetic factors. If we had many more observations than genetic predictors, this quantity could be estimated using something like an adjusted R^2. Unfortunately, in genetic studies we have sample sizes in the thousands and genetic predictors (SNPs) in the millions. I will outline methods that can be used to tackle this issue. Another issue that arises is “population stratification,” where unobserved confounders related to differing genetic ancestries in the sample can bias heritability estimates. I will discuss how and why principal component analysis can sometimes account for this, as well as an extension of the previously described GWASH estimator of Schwartzman et al. for estimating heritability in the presence of population stratification.
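To illustrate why an adjusted R^2 only helps when observations far outnumber predictors, here is a minimal sketch using the textbook formula 1 - (1 - R^2)(n - 1)/(n - p - 1). The function name and the example numbers are hypothetical and chosen for illustration; this is not the estimator presented in the talk.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Textbook adjusted R^2 for n observations and p predictors."""
    if n - p - 1 <= 0:
        # No residual degrees of freedom are left, so the correction breaks down.
        raise ValueError("adjusted R^2 is undefined when p >= n - 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Many more observations than predictors: the correction is small.
low_dim = adjusted_r2(r2=0.30, n=5000, p=10)

# Thousands of samples but millions of SNPs (illustrative numbers):
# the formula cannot be applied at all.
try:
    adjusted_r2(r2=0.30, n=5000, p=1_000_000)
    high_dim_ok = True
except ValueError:
    high_dim_ok = False
```

In the second call the residual degrees of freedom are negative, which is the regime where different methods are needed.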


"Selective Peak Inference in fMRI" 

Samuel Davenport, PhD
1:00 PM – 2:00 PM PST

ABSTRACT: The spatial signals in neuroimaging mass univariate analyses can be characterized in a number of ways, but one widely used approach is peak inference: the identification of peaks in the image. Peak locations and magnitudes provide a useful summary of activation and are routinely reported; however, the magnitudes reflect selection bias, as these points have both survived a threshold and are local maxima. In this talk, Dr. Davenport will discuss the use of resampling methods to estimate and correct this bias in order to estimate both the change in raw units and the standardized effect size, measured with Cohen's d and partial R^2. The method was evaluated on a massive open dataset (imaging data from the UK Biobank), and he will discuss how the corrected estimates can be used to perform power analyses.
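The selection bias at a peak, and the general idea of estimating it by resampling, can be sketched on a toy one-dimensional "image". This simplified illustration makes assumptions not taken from the abstract: pure-noise data with a true effect of zero, only the global maximum is considered (no threshold, no local maxima), and all function names are hypothetical. It should not be read as the speaker's actual procedure.

```python
import random


def mean_image(subjects):
    """Voxel-wise mean across subjects (each subject is a list of voxel values)."""
    n = len(subjects)
    return [sum(col) / n for col in zip(*subjects)]


def bootstrap_peak_bias(subjects, n_boot=200, seed=0):
    """Average gap between a bootstrap peak value and the observed value at that voxel."""
    rng = random.Random(seed)
    n = len(subjects)
    observed = mean_image(subjects)
    total = 0.0
    for _ in range(n_boot):
        # Resample subjects with replacement and locate the peak of the resampled mean.
        resample = [subjects[rng.randrange(n)] for _ in range(n)]
        boot = mean_image(resample)
        peak = max(range(len(boot)), key=boot.__getitem__)
        # How much the peak overshoots the observed-sample value at the same voxel.
        total += boot[peak] - observed[peak]
    return total / n_boot


# Toy data (assumed): 20 subjects, 50 voxels, true effect 0 everywhere.
rng = random.Random(1)
true_effect = 0.0
subjects = [[rng.gauss(true_effect, 1.0) for _ in range(50)] for _ in range(20)]

observed = mean_image(subjects)
naive_peak = max(observed)            # overestimates the true effect of 0
bias = bootstrap_peak_bias(subjects)  # resampling estimate of the overshoot
corrected_peak = naive_peak - bias
```

Because the peak is chosen for being the largest value, the naive peak sits above the true effect even though no voxel has any signal; subtracting the resampling estimate pulls it back down.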


"Semiparametric Regression Models for Between- and Within-subject Attributes: Applications to High-Dimensional Data, Asymptotic Efficiency and Beyond" 

Jinyuan Liu
1:00 PM – 2:00 PM PST

ABSTRACT: Breakthroughs such as high-throughput sequencing are generating a wealth of high-dimensional data that pose challenges for both statistical analysis and interpretation. Since directly modeling such data often suffers from multiple testing and low power, an emerging alternative is to reduce the dimension at the outset by comparing two subjects' genome sequences using dissimilarity metrics, yielding “between-subject attributes.” In the first half of this talk, I will extend the classical generalized linear models (GLM) to establish a new regression paradigm for between-subject attributes, using a class of semiparametric functional response models (FRM). Despite its growing applications, the efficiency of estimators for the FRM has not yet been carefully studied. This is of fundamental importance for semiparametric models, because efficiency is lost as the price of minimal model assumptions. In the second half of the talk, we leverage the Hilbert-space-based semiparametric efficiency theory to show that estimators from a class of U-statistics-based generalized estimating equations (UGEE) achieve the semiparametric efficiency bound. Thus, like GEE for semiparametric GLM, UGEE estimators also harmonize efficiency and robustness, propelling growing applications in biomedical, psychosocial, and related research.