Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem
Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem
Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy uses predictive modeling techniques to "mine" variables of interest from available data, then includes those variables into an econometric framework to estimate causal effects. However, because the predictions from machine learning …