Correcting for Sampling Error in Between-Cluster Effects: An Empirical Bayes Cluster-Mean Approach with Finite Population Corrections

Type: Article

Publication Date: 2024-02-13

Citations: 0

DOI: https://doi.org/10.1080/00273171.2024.2307034

Abstract

With clustered data, such as students nested within schools or employees nested within organizations, it is often of interest to estimate and compare associations among variables separately for each level. While researchers routinely estimate between-cluster effects using the sample cluster means of a predictor, previous research has shown that this practice leads to biased estimates of coefficients at the between level, and recent research has recommended the use of latent cluster means within the multilevel structural equation modeling framework. However, the latent cluster mean approach may not always be the best choice, as it (a) relies on the assumption that the population cluster sizes are close to infinite, (b) requires a relatively large number of clusters, and (c) is currently only implemented in specialized software such as Mplus. In this paper, we show how using empirical Bayes estimates of the cluster means can also lead to consistent estimates of between-level coefficients, and illustrate how the empirical Bayes estimate can incorporate finite population corrections when information on population cluster sizes is available. Through a series of Monte Carlo simulation studies, we show that the empirical Bayes cluster-mean (EBM) approach performs similarly to the latent cluster mean approach for estimating the between-cluster coefficients in most conditions when the infinite-population assumption holds, and that applying the finite population correction provides reasonable point and interval estimates when the population is finite. The performance of EBM can be further improved with restricted maximum likelihood estimation and likelihood-based confidence intervals. We also provide an R function that implements the empirical Bayes cluster-mean approach, and illustrate it using data from the classic High School and Beyond Study.
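
The shrinkage idea behind an empirical Bayes cluster mean can be illustrated with a minimal Python sketch. It is not the authors' R function: the function name `eb_cluster_means`, its parameters, and the specific correction factor `1 - n_j / N_j` (the usual factor for sampling without replacement) are assumptions made here for illustration. The variance components `tau2` (between-cluster) and `sigma2` (within-cluster) are passed in as if known and `tau2` is assumed positive; in practice they would be estimated, for example by restricted maximum likelihood.

```python
from typing import List, Optional, Sequence


def eb_cluster_means(
    sample_means: Sequence[float],
    sample_sizes: Sequence[int],
    grand_mean: float,
    tau2: float,
    sigma2: float,
    pop_sizes: Optional[Sequence[float]] = None,
) -> List[float]:
    """Shrink each sample cluster mean toward the grand mean.

    Hypothetical helper: tau2 is the between-cluster variance (assumed > 0),
    sigma2 the within-cluster variance. If pop_sizes is None, clusters are
    treated as infinitely large (no finite population correction).
    """
    eb_means = []
    for j, (xbar, n) in enumerate(zip(sample_means, sample_sizes)):
        # Assumed finite population correction: 1 - n_j / N_j.
        fpc = 1.0 if pop_sizes is None else 1.0 - n / pop_sizes[j]
        # Sampling variance of the sample cluster mean.
        sampling_var = fpc * sigma2 / n
        # Reliability of the sample mean; this is the shrinkage weight.
        reliability = tau2 / (tau2 + sampling_var)
        eb_means.append(grand_mean + reliability * (xbar - grand_mean))
    return eb_means


# One cluster with sample mean 2.0 from n = 4 units, grand mean 0.0.
infinite_pop = eb_cluster_means([2.0], [4], 0.0, tau2=1.0, sigma2=4.0)
finite_pop = eb_cluster_means([2.0], [4], 0.0, tau2=1.0, sigma2=4.0, pop_sizes=[8])
fully_sampled = eb_cluster_means([2.0], [4], 0.0, tau2=1.0, sigma2=4.0, pop_sizes=[4])
```

Under these assumptions, the correction reduces the amount of shrinkage as the sampled fraction grows, and a cluster that is fully sampled is not shrunk at all, because its sample mean then carries no sampling error.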

Locations

  • PsyArXiv (OSF Preprints)
  • PubMed
  • Multivariate Behavioral Research

Similar Works

  • Correcting for sampling error in between-cluster effects: An empirical Bayes cluster-mean approach with finite population corrections (2022). Mark H. C. Lai, Yichi Zhang, Feng Ji
  • Modeling sparsely clustered data: Design-based, model-based, and single-level methods (2014). Daniel McNeish
  • Modeling Clustered Data with Very Few Clusters (2016). Daniel McNeish, Laura M. Stapleton
  • Evaluating Two Small Sample Corrections for Fixed-Effects Standard Errors and Inferences in Multilevel Models With Heteroscedastic, Unbalanced, Clustered Data (2024). Yichi Zhang, Mark H. C. Lai
  • The Role of Sample Cluster Means in Multilevel Models (2011). Leonardo Grilli, Carla Rampichini
  • Evaluating two small-sample corrections for fixed-effects standard errors and inferences in multilevel models with heteroscedastic, unbalanced, clustered data (2024). Yichi Zhang, Mark H. C. Lai
  • A Practitioner’s Guide to Cluster-Robust Inference (2015). A. Colin Cameron, Douglas L. Miller
  • Cluster-level Correlated Error Variance and the Estimation of Parameters in Linear Mixed Models (2014). Joseph N. Luchman
  • Reconsidering Cluster Bias in Multilevel Data: A Monte Carlo Comparison of Free and Constrained Baseline Approaches (2018). Nigel Guenole
  • Comparing Random Effects Models, Ordinary Least Squares, or Fixed Effects with Cluster Robust Standard Errors for Cross-Classified Data (2021). Young Ri Lee, James E. Pustejovsky
  • Estimating Multilevel Logistic Regression Models When the Number of Clusters is Low: A Comparison of Different Statistical Software Procedures (2010). Peter C. Austin
  • Variance partitioning in multilevel models for count data (2019). George Leckie, Will N. Browne, Harvey Goldstein, Juan Merlo, Peter C. Austin
  • Variance partitioning in multilevel models for count data (2019). George Leckie, William J. Browne, Harvey Goldstein, Juan Merlo, Peter C. Austin
  • Complex sampling designs in large-scale education surveys: a two-level sample distribution approach (2021). Ting Shen, Spyros Konstantopoulos
  • Small-Sample Methods for Cluster-Robust Variance Estimation and Hypothesis Testing in Fixed Effects Models (2016). James E. Pustejovsky, Elizabeth Tipton
  • Accounting for Heteroskedasticity Resulting from Between-Group Differences in Multilevel Models (2022). Francis L. Huang, Wolfgang Wiedermann, Bixi Zhang
  • Finite population correction for two-level hierarchical linear models (2017). Mark H. C. Lai, Oi‐Man Kwok, Yu‐Yu Hsiao, Qian Cao
  • Measurement Error Correction Formula for Cluster-Level Group Differences in Cluster Randomized and Observational Studies (2015). Sun‐Joo Cho, Kristopher J. Preacher
  • A Latent Cluster-Mean Approach to the Contextual Effects Model With Missing Data (2010). Yongyun Shin, Stephen W. Raudenbush
  • Handling Correlations Between Covariates and Random Slopes in Multilevel Models (2014). Michael D. Bates, Katherine E. Castellano, Sophia Rabe‐Hesketh, Anders Skrondal
