Working Paper

A New Tool for Robust Estimation and Identification of Unusual Data Points


Abstract: Most consistent estimators are what Müller (2007) terms “highly fragile”: prone to total breakdown in the presence of a handful of unusual data points. This compromises inference. Robust estimation is a (seldom-used) solution, but commonly used methods have drawbacks. In this paper, building on methods that are relatively unknown in economics, we provide a new tool for robust estimation of mean and covariance, useful both for robust estimation and for detection of unusual data points. It is relatively fast and well suited to large data sets. Our performance testing indicates that our baseline method performs on par with, or better than, two of the best currently available methods, and that it works well on benchmark data sets. We also demonstrate that the issues we discuss are not merely hypothetical, by re-examining a prominent economic study and showing that its central results are driven by a set of unusual points.

Keywords: big data; machine learning; robust estimation; detMCD; RMVN; fragility; outlier identification

JEL Classification: C3; C4; C5

https://doi.org/10.26509/frbc-wp-202008

Bibliographic Information

Provider: Federal Reserve Bank of Cleveland

Part of Series: Working Papers

Publication Date: 2020-03-05

Number: 20-08