Constructing Applicants from Loan-Level Data: A Case Study of Mortgage Applications

Elzayn, Hadi; Freyaldenhoven, Simon; Shin, Minchul

doi:10.21799/frbp.wp.2025.05

Working Paper

Abstract: We develop a clustering-based algorithm to detect loan applicants who submit multiple applications (“cross-applicants”) in a loan-level dataset without personal identifiers. A key innovation of our approach is a novel evaluation method that does not require labeled training data, allowing us to optimize the tuning parameters of our machine learning algorithm. By applying this methodology to Home Mortgage Disclosure Act (HMDA) data, we create a unique dataset that consolidates mortgage applications to the individual applicant level across the United States. Our preferred specification identifies cross-applicants with 93 percent precision.

JEL Classification: C38; C63; C81; G21; R21

https://doi.org/10.21799/frbp.wp.2025.05

Access Documents

File(s): File format is application/pdf https://www.philadelphiafed.org/-/media/FRBP/Assets/working-papers/2025/wp25-05.pdf

Authors

Elzayn, Hadi

Freyaldenhoven, Simon

Shin, Minchul

Bibliographic Information

Provider: Federal Reserve Bank of Philadelphia

Part of Series: Working Papers

Publication Date: 2025-02-04

Number: 25-05

Fed in Print
