How Magic a Bullet Is Machine Learning for Credit Analysis? An Exploration with FinTech Lending Data
FinTech online lending to consumers has grown rapidly in the post-crisis era. As argued by its advocates, one key advantage of FinTech lending is that lenders can predict loan outcomes more accurately by employing complex analytical tools, such as machine learning (ML) methods. This study applies ML methods, in particular random forests and stochastic gradient boosting, to loan-level data from the largest FinTech lender of personal loans to assess the extent to which those methods can produce more accurate out-of-sample predictions of default on future loans relative to standard regression models. To explain loan outcomes, this analysis accounts for the economic conditions faced by a borrower after origination, which are typically absent from other ML studies of default. For the given data, the ML methods indeed improve prediction accuracy, but more so over the near horizon than beyond a year. This study then shows that having more data up to, but not beyond, a certain quantity enhances the predictive accuracy of the ML methods relative to that of parametric models. The likely explanation is that there has been data or model drift over time, so that methods that fit more complex models with more data can in fact suffer greater out-of-sample misses. Prediction accuracy rises, but only marginally, with additional standard credit variables beyond the core set, suggesting that unconventional data need to be sufficiently informative as a whole to help consumers with little or no credit history. This study further explores whether the greater functional flexibility of ML methods yields unequal benefit to consumers with different attributes or who reside in locales with varying economic conditions. It finds that the ML methods produce more favorable ratings for different groups of consumers, although those already deemed less risky seem to benefit more on balance.
AUTHORS: Perkins, Charles B.; Wang, J. Christina
Testing for Differences in Path Forecast Accuracy: Forecast-Error Dynamics Matter
Although the trajectory and path of future outcomes play an important role in policy decisions, analyses of forecast accuracy typically focus on individual point forecasts. However, it is important to examine the path forecast errors since they include the forecast dynamics. We use the link between path forecast evaluation methods and the joint predictive density to propose a test for differences in system path forecast accuracy. We also demonstrate how our test relates to and extends existing joint testing approaches. Simulations highlight both the advantages and disadvantages of path forecast accuracy tests in detecting a broad range of differences in forecast errors. We compare the Federal Reserve's Greenbook point and path forecasts against four DSGE model forecasts. The results show that differences in forecast-error dynamics can play an important role in the assessment of forecast accuracy.
AUTHORS: Martinez, Andrew
Evaluating Conditional Forecasts from Vector Autoregressions
Many forecasts are conditional in nature. For example, a number of central banks routinely report forecasts conditional on particular paths of policy instruments. Even though conditional forecasting is common, there has been little work on methods for evaluating conditional forecasts. This paper provides analytical, Monte Carlo, and empirical evidence on tests of predictive ability for conditional forecasts from estimated models. In the empirical analysis, we consider forecasts of growth, unemployment, and inflation from a VAR, based on conditions on the short-term interest rate. Throughout the analysis, we focus on tests of bias, efficiency, and equal accuracy applied to conditional forecasts from VAR models.
AUTHORS: Clark, Todd E.; McCracken, Michael W.
Estimating (Markov-Switching) VAR Models without Gibbs Sampling: A Sequential Monte Carlo Approach
Vector autoregressions with Markov-switching parameters (MS-VARs) offer dramatically better data fit than their constant-parameter predecessors. However, computational complications, as well as negative results about the importance of switching in parameters other than shock variances, have caused MS-VARs to see only sparse usage. For our first contribution, we document the effectiveness of Sequential Monte Carlo (SMC) algorithms at estimating MS-VAR posteriors. Relative to multi-step, model-specific MCMC routines, SMC has the advantages of being simpler to implement, readily parallelizable, and unconstrained by reliance on convenient relationships between prior and likelihood. For our second contribution, we exploit SMC's flexibility to demonstrate that the use of priors with superior data fit alters inference about the presence of time variation in macroeconomic dynamics. Using the same data as Sims, Waggoner, and Zha (2008), we provide evidence of recurrent episodes characterized by a flat Phillips Curve.
AUTHORS: Herbst, Edward; Bognanni, Mark
A Class of Time-Varying Parameter Structural VARs for Inference under Exact or Set Identification
This paper develops a new class of structural vector autoregressions (SVARs) with time-varying parameters, which I call a drifting SVAR (DSVAR). The DSVAR is the first structural time-varying parameter model to allow for internally consistent probabilistic inference under exact or set identification, nesting the widely used SVAR framework as a special case. I prove that the DSVAR implies a reduced-form representation, from which structural inference can proceed similarly to the widely used two-step approach for SVARs: beginning with estimation of a reduced form and then choosing among observationally equivalent candidate structural parameters via the imposition of identifying restrictions. In a special case, the implied reduced form is a tractable known model for which I provide the first algorithm for Bayesian estimation of all free parameters. I demonstrate the framework in the context of Baumeister and Peersman's (2013b) work on time variation in the elasticity of oil demand.
AUTHORS: Bognanni, Mark
Monetary Policy and Macroeconomic Stability Revisited
A large literature has established that the Fed's change from a passive to an active policy response to inflation led to US macroeconomic stability after the Great Inflation of the 1970s. This paper revisits the literature's view by estimating a generalized New Keynesian model using a full-information Bayesian method that allows for equilibrium indeterminacy and adopts a sequential Monte Carlo algorithm. The model empirically outperforms canonical New Keynesian models that confirm the literature's view. Our estimated model shows an active policy response to inflation even during the Great Inflation. More importantly, a more active policy response to inflation alone does not suffice for explaining the US macroeconomic stability, unless it is accompanied by a change in either trend inflation or policy responses to the output gap and output growth. This extends the literature by emphasizing the importance of the changes in other aspects of monetary policy in addition to its response to inflation.
AUTHORS: Kurozumi, Takushi; Van Zandweghe, Willem; Hirose, Yasuo
Misspecification-robust inference in linear asset pricing models with irrelevant risk factors
We show that in misspecified models with useless factors (for example, factors that are independent of the returns on the test assets), the standard inference procedures tend to erroneously conclude, with high probability, that these irrelevant factors are priced and the restrictions of the model hold. Our proposed model selection procedure, which is robust to useless factors and potential model misspecification, restores the standard inference and proves to be effective in eliminating factors that do not improve the model's pricing ability. The practical relevance of our analysis is illustrated using simulations and empirical applications.
AUTHORS: Gospodinov, Nikolay; Kan, Raymond; Robotti, Cesare
General Aggregation of Misspecified Asset Pricing Models
This paper proposes an entropy-based approach for aggregating information from misspecified asset pricing models. The statistical paradigm is shifted away from parameter estimation of an optimally selected model to stochastic optimization based on a risk function of aggregation across models. The proposed method relaxes the perfect substitutability of the candidate models, which is implicitly embedded in the linear pooling procedures, and ensures that the aggregation weights are selected with a proper (Hellinger) distance measure that satisfies the triangle inequality. The empirical results illustrate the robustness and the pricing ability of the aggregation approach to stochastic discount factor models.
AUTHORS: Gospodinov, Nikolay; Maasoumi, Esfandiar
The Uniform Validity of Impulse Response Inference in Autoregressions
Existing proofs of the asymptotic validity of conventional methods of impulse response inference based on higher-order autoregressions are pointwise only. In this paper, we establish the uniform asymptotic validity of conventional asymptotic and bootstrap inference about individual impulse responses and vectors of impulse responses when the horizon is fixed with respect to the sample size. For inference about vectors of impulse responses based on Wald test statistics to be uniformly valid, lag-augmented autoregressions are required, whereas inference about individual impulse responses is uniformly valid under weak conditions even without lag augmentation. We introduce a new rank condition that ensures the uniform validity of inference on impulse responses and show that this condition holds under weak conditions. Simulations show that the highest finite-sample accuracy is achieved when bootstrapping the lag-augmented autoregression using the bias adjustments of Kilian (1999). The conventional bootstrap percentile interval for impulse responses based on this approach remains accurate even at long horizons. We provide a formal asymptotic justification for this result.
AUTHORS: Inoue, Atsushi; Kilian, Lutz
Refining the Workhorse Oil Market Model
The Kilian and Murphy (2014) structural vector autoregressive model has become the workhorse model for the analysis of oil markets. I explore various refinements and extensions of this model, including the effects of (1) correcting an error in the measure of global real economic activity, (2) explicitly incorporating narrative sign restrictions into the estimation, (3) relaxing the upper bound on the impact price elasticity of oil supply, (4) evaluating the implied posterior distribution of the structural models, and (5) extending the sample. I demonstrate that the substantive conclusions of Kilian and Murphy (2014) are largely unaffected by these changes.
AUTHORS: Zhou, Xiaoqing