R-miss-tastic: a unified platform for missing values methods and workflows

Type: Article

Publication Date: 2022-10-10

Citations: 24

DOI: https://doi.org/10.32614/rj-2022-040

Abstract

Missing values are unavoidable when working with data. Their occurrence is exacerbated as more data from different sources become available. However, most statistical models and visualization methods require complete data, and improper handling of missing data results in information loss or biased analyses. Since the seminal work of Rubin (1976), a burgeoning literature on missing values has arisen, with heterogeneous aims and motivations. This led to the development of various methods, formalizations, and tools. For practitioners, it remains nevertheless challenging to decide which method is most suited for their problem, partially due to a lack of systematic covering of this topic in statistics or data science curricula. To help address this challenge, we have launched the "R-miss-tastic" platform, which aims to provide an overview of standard missing values problems, methods, and relevant implementations of methodologies. Beyond gathering and organizing a large majority of the material on missing data (bibliography, courses, tutorials, implementations), "R-miss-tastic" covers the development of standardized analysis workflows. Indeed, we have developed several pipelines in R and Python to allow for hands-on illustration of and recommendations on missing values handling in various statistical tasks such as matrix completion, estimation and prediction, while ensuring reproducibility of the analyses. Finally, the platform is dedicated to users who analyze incomplete data, researchers who want to compare their methods and search for an up-to-date bibliography, and also teachers who are looking for didactic materials (notebooks, video, slides).

Locations

  • The R Journal
  • arXiv (Cornell University)
  • HAL (Le Centre pour la Communication Scientifique Directe)
  • DataCite API

Similar Works

Action Title Year Authors
+ Detection of the Missing Values Mechanism with Tests and Models 2023 Matthias Templ
+ A review on missing values for main challenges and methods 2023 Lijuan Ren
Tao Wang
Aïcha Sekhari
Haiqing Zhang
Abdelaziz Bouras
+ Review for Handling Missing Data with special missing mechanism 2024 Youran Zhou
Sunil Aryal
Mohamed Reda Bouadjenek
+ Expanding tidy data principles to facilitate missing data exploration, visualization and assessment of imputations 2018 Nicholas Tierney
Dianne H Cook
+ Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations 2023 Nicholas Tierney
Dianne Cook
+ Challenges and strategies in analysis of missing data 2019 Xiao‐Hua Zhou
+ Exploring, handling, imputing and evaluating missing data in statistical analyses: a review of existing approaches 2018 Alyssa Imbert
Nathalie Villa‐Vialaneix
+ Missing Data 2025 Vaiva Deltuvaite‐Thomas
Tomasz Burzykowski
+ Expanding tidy data principles to facilitate missing data exploration, visualization and assessment of imputations 2018 Nicholas Tierney
Dianne Cook
+ Missing data: Our view of the state of the art. 2002 Joseph L. Schafer
John W. Graham
+ Missing Data Methods 2020 Kristian Kleinke
Jost Reinecke
Daniel Salfrán
Martin Spieß
+ Raoul: An R-Package for Handling Missing Data 2016 David Randahl
+ Missing Data Imputation and Analysis 2011 Mark Chang
+ A review of the current publication trends on missing data imputation over three decades: direction and future research 2022 Farah Adibah Adnan
Khairur Rijal Jamaludin
Wan Zuki Azman Wan Muhamad
Suraya Miskon
+ Missing Data Imputation Toolbox for MATLAB 2016 Abel Folch‐Fortuny
Francisco Moreno
Alberto Ferrer
+ Solas3.0 : For missing data analysis 2001 撌之 北村
和夫 篠津
+ Quality control, data cleaning, imputation 2021 Dawei Liu
Hanne Oberman
Johanna Muñoz
Jeroen Hoogland
Thomas P. A. Debray
+ Missing Data 2018 Steven A. Gilbert
Jared C. Christensen
+ Missing Data 2014 Roderick J. A. Little
+ Missing data 2007 Douglas G. Altman
Martin Bland

Works That Cite This (13)

Action Title Year Authors
+ Does imputation matter? Benchmark for predictive models 2020 Katarzyna Woźnica
Przemysław Biecek
+ Causal Inference Methods for Combining Randomized Trials and Observational Studies: A Review 2024 Bénédicte Colnet
Imke Mayer
Guanhua Chen
Awa Dieng
Ruohong Li
Gaël Varoquaux
Jean-Philippe Vert
Julie Josse
Shu Yang
+ Linear predictor on linearly-generated data with missing values: non consistency and solutions 2020 Marine Le Morvan
Nicolas Prost
Julie Josse
Erwan Scornet
Gaël Varoquaux
+ Causal inference methods for combining randomized trials and observational studies: a review 2020 Bénédicte Colnet
Imke Mayer
Guanhua Chen
Awa Dieng
Ruohong Li
Gaël Varoquaux
Jean‐Philippe Vert
Julie Josse
Shu Yang
+ MissDeepCausal: causal inference from incomplete data using deep latent variable models 2020 Imke Mayer
Julie Josse
Félix Raimundo
Jean-Philippe Vert
+ Estimation and prediction with data quality indexes in linear regressions 2023 Pierre Chatelain
Xavier Milhaud
+ Generalizing treatment effects with incomplete covariates: Identifying assumptions and multiple imputation algorithms 2023 Imke Mayer
Julie Josse
+ Proper Scoring Rules for Missing Value Imputation 2021 Loris Michel
Jeffrey Näf
Meta-Lina Spohn
Nicolai Meinshausen
+ On the consistency of supervised learning with missing values 2024 Julie Josse
Jacob M. Chen
Nicolas Prost
Gaël Varoquaux
Erwan Scornet
+ A Manifesto for More Productive Psychological Games Research 2022 Nick Ballou