The Dataset Multiplicity Problem: How Unreliable Data Impacts Predictions
The Dataset Multiplicity Problem: How Unreliable Data Impacts Predictions
We introduce dataset multiplicity, a way to study how inaccuracies, uncertainty, and social bias in training datasets impact test-time predictions. The dataset multiplicity framework asks a counterfactual question of what the set of resultant models (and associated test-time predictions) would be if we could somehow access all hypothetical, unbiased versions …