Statistical estimates from survey samples have traditionally been obtained via design-based estimators. These estimators tend to work well for quantities such as population totals or means, but can fall short as sample sizes become small. In today's "information age," there is a strong demand for more granular estimates. To meet this demand, we propose a computationally efficient unit-level modeling approach, based on a Bayesian pseudo-likelihood, for non-Gaussian data collected under informative sampling designs. Specifically, we focus on binary and multinomial data. Our approach is both multivariate and multiscale, incorporating spatial dependence at the area level. Through the use of a variational Bayes approximation, we can accommodate massive data applications, such as unit-level data for the entire United States. We illustrate our approach through an empirical simulation study and through a motivating application to health insurance estimates using the American Community Survey. We also extend the approach to a deep learning setting to allow for more complex covariates, and illustrate this extension using data from the American National Election Studies.
Paul Parker is currently a Ph.D. student in the Department of Statistics at the University of Missouri. He is expected to graduate during Summer 2021. His research interests include models for dependent data (e.g., spatial, spatio-temporal, and time series), often in the context of official statistics and government applications. He also enjoys data science and computational problems, especially when coupled with Bayesian methodology.
As a gentle reminder, please respect the privacy of faculty recruitment by not sharing the candidate status of our guests with others outside of our organization.
Zoom Link: https://ucsc.zoom.us/j/94623791866?pwd=dkQ3STZ0VGpiWnJnb1ozMzVVaXpYQT09