Working Paper

Explaining Machine Learning by Bootstrapping Partial Dependence Functions and Shapley Values


Abstract: Machine learning and artificial intelligence methods are often referred to as “black boxes” when compared with traditional regression-based approaches. However, both traditional and machine learning methods are concerned with modeling the joint distribution of endogenous (target) and exogenous (input) variables. Whereas linear models describe the fitted relationship between the target and input variables via the slope of that relationship (the coefficient estimates), the same fitted relationship can be described rigorously for any machine learning model by first-differencing the partial dependence functions. Bootstrapping these first-differenced functionals provides standard errors and confidence intervals for the estimated relationships. We show that this approach replicates the point estimates of OLS coefficients and demonstrate how it generalizes to marginal relationships in machine learning and artificial intelligence models. We further discuss the relationship of partial dependence functions to Shapley value decompositions and explore how they can be used to further explain model outputs.
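
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the simulated data, the grid, the function names (`fit_ols`, `partial_dependence`, `pd_slope`, `bootstrap_pd_slope`), and the choice to summarize the first-differenced partial dependence function by its mean slope over the grid are all assumptions made here for illustration. For a linear model, that mean slope should coincide with the OLS coefficient, and refitting on resampled data gives a bootstrap standard error and percentile interval.

```python
import numpy as np

def fit_ols(X, y):
    """Fit OLS with an intercept; return a prediction function and coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    def predict(X_new):
        return beta[0] + X_new @ beta[1:]
    return predict, beta

def partial_dependence(predict, X, j, grid):
    """Average prediction over the sample with column j fixed at each grid value."""
    pd_values = np.empty(len(grid))
    for k, v in enumerate(grid):
        X_fixed = X.copy()
        X_fixed[:, j] = v
        pd_values[k] = predict(X_fixed).mean()
    return pd_values

def pd_slope(predict, X, j, grid):
    """Mean first difference of the partial dependence function per unit of x_j."""
    pd_values = partial_dependence(predict, X, j, grid)
    return np.mean(np.diff(pd_values) / np.diff(grid))

def bootstrap_pd_slope(X, y, j, grid, n_boot=200, seed=0):
    """Resample rows with replacement, refit, and recompute the slope each time."""
    rng = np.random.default_rng(seed)
    n = len(y)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        predict_b, _ = fit_ols(X[idx], y[idx])
        slopes[b] = pd_slope(predict_b, X[idx], j, grid)
    return slopes

# Simulated data (an assumption for illustration): y = 1 + 2*x0 - 0.5*x1 + noise
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=500)

grid = np.linspace(-2.0, 2.0, 21)
predict, beta = fit_ols(X, y)

slope = pd_slope(predict, X, 0, grid)        # first-differenced PD for x0
boot = bootstrap_pd_slope(X, y, 0, grid)     # bootstrap distribution of the slope
se = boot.std(ddof=1)                        # bootstrap standard error
ci_lo, ci_hi = np.percentile(boot, [2.5, 97.5])  # 95% percentile interval

print(f"OLS coefficient on x0:     {beta[1]:.4f}")
print(f"PD first-difference slope: {slope:.4f}")
print(f"bootstrap SE: {se:.4f}, 95% CI: [{ci_lo:.4f}, {ci_hi:.4f}]")
```

Because `pd_slope` only needs a prediction function, the same routine could in principle be applied to any fitted model in place of `fit_ols`; for a nonlinear model the first differences would vary along the grid rather than collapse to a single constant.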

Keywords: Machine learning; Artificial intelligence; Explainable machine learning; Shapley values; Model interpretation

JEL Classification: C14; C15; C18

https://doi.org/10.18651/RWP2021-12

Access Documents

File(s): https://www.kansascityfed.org/documents/8518/rwp21-12cookguptonmodigpalmer.pdf
File format: application/pdf
Description: Full text

Bibliographic Information

Provider: Federal Reserve Bank of Kansas City

Part of Series: Research Working Paper

Publication Date: 2021-11-15

Number: RWP 21-12