Study Finds Cash Flow and Machine Learning Models Unlock Fairer, Safer Credit Approvals

More accurate and more inclusive credit scoring does not have to come at the cost of higher risk. That’s what FinRegLab’s new empirical study found when it tested eight underwriting models built on traditional and alternative data sets, using both logistic regression (LR) and machine learning (ML) techniques.

The researchers found that feeding cash flow data, drawn directly from consumers’ bank account transactions, into ML algorithms produced the largest gains in predictiveness and inclusiveness.

The findings offer credit scoring professionals timely, real-world applications. When properly implemented, these models can reach borrowers who are excluded by standard scores without increasing charge-offs.

That offers hope to underserved segments and gives lenders a way to balance fairness and profit. But implementation choices matter, and even incremental improvements bring meaningful gains.

ML and Cash Flow Unlock Complementary Gains

FinRegLab compared XGBoost-based ML models with LR under four data settings: traditional credit bureau data, cash flow data, and two hybrid settings. The results are clear: ML performed better in every setting, and cash flow data improved performance for both model types.

Cash flow data combined with ML algorithms presents a more predictive and inclusive credit scoring model.

For example, among models built on credit bureau data, ML improved ROC-AUC, a standard measure of predictiveness, by 1.78 points compared to LR. The hybrid ML model using both bureau and cash flow data improved it by more than two points.
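To make the metric concrete, here is a minimal sketch of how ROC-AUC can be computed and compared between two models. The toy labels and scores are hypothetical, not FinRegLab data, and the sketch assumes that a "point" means one hundredth of AUC on its 0-to-1 scale.

```python
def roc_auc(labels, scores):
    """Chance that a random defaulter is scored riskier than a random
    non-defaulter. Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = defaulted, 0 = repaid.
defaulted = [0, 0, 1, 0, 1, 0, 1, 0]
# Predicted default risk from two hypothetical models.
model_a = [0.10, 0.30, 0.35, 0.40, 0.60, 0.20, 0.80, 0.50]
model_b = [0.10, 0.30, 0.45, 0.40, 0.60, 0.20, 0.80, 0.50]

auc_a = roc_auc(defaulted, model_a)
auc_b = roc_auc(defaulted, model_b)
# Assumption: one "point" = 0.01 of AUC.
gain_in_points = (auc_b - auc_a) * 100
print(f"A: {auc_a:.3f}  B: {auc_b:.3f}  gain: {gain_in_points:.2f} points")
```

On this tiny sample the gap is about 6.7 points, far larger than anything in the study, because a single reordered pair matters a great deal with only eight applicants. On a large portfolio, a gain of one or two points reflects many such reorderings.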

Approval simulations found that these ML models would approve up to 4% more applicants at conservative risk thresholds while reducing approvals of future defaulters by at least 9%.

In practical terms, that translates to millions of new credit card accounts and hundreds of thousands of new mortgage approvals for lower-risk consumers who were previously misclassified.
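The approval simulation described above can be sketched in a few lines. This is an illustration only: the function name, the 5% threshold, and the ten applicants are all hypothetical, and the study's actual simulation method is not detailed here.

```python
def simulate_approvals(default_probs, defaulted, threshold):
    """Approve every applicant whose predicted default probability is at or
    below the threshold, then summarize the result."""
    approved = [p <= threshold for p in default_probs]
    bad_approved = sum(1 for a, d in zip(approved, defaulted) if a and d == 1)
    return {
        "approval_rate": sum(approved) / len(default_probs),
        "defaulters_approved": bad_approved,
    }


# Hypothetical outcomes for ten applicants: 1 = later defaulted.
defaulted = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
# Hypothetical predicted default probabilities from two models.
baseline = [0.02, 0.03, 0.04, 0.06, 0.07, 0.09, 0.12, 0.04, 0.15, 0.20]
improved = [0.02, 0.03, 0.04, 0.04, 0.05, 0.09, 0.12, 0.11, 0.15, 0.20]

THRESHOLD = 0.05  # assumed conservative risk cutoff
base_result = simulate_approvals(baseline, defaulted, THRESHOLD)
new_result = simulate_approvals(improved, defaulted, THRESHOLD)
print(base_result)
print(new_result)
```

In this toy case the improved model approves five applicants instead of four and approves no future defaulters instead of one. That is the same pattern the study reports at scale: more approvals and fewer bad ones at the same risk cutoff.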

Improved Access, Especially for Underserved Borrowers

Despite a dataset that skewed toward prime borrowers, FinRegLab found that ML and hybrid models performed better in every demographic subgroup it evaluated: low-income, minority, and credit-challenged cohorts.

Approval rates rose most consistently among recent subprime borrowers and fell most sharply among low-to-moderate-income applicants who later defaulted. In short, the models expanded access while also making it more precise.

These results support the argument that new data and tools can reduce systemic biases in legacy models. By using granular inflow and outflow data, cash flow underwriting captures signs of financial resilience that traditional scoring misses.
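As a rough illustration of what inflow and outflow data can yield, the sketch below derives a few summary features from a list of bank transactions. The feature names and the sample account are hypothetical. The features the study's models actually used are not listed here.

```python
def cash_flow_features(opening_balance, transactions):
    """Summarize (date, amount) transactions. Positive amounts are inflows,
    negative amounts are outflows. Dates are ISO strings so they sort in
    calendar order."""
    inflow = sum(amount for _, amount in transactions if amount > 0)
    outflow = -sum(amount for _, amount in transactions if amount < 0)
    balance = opening_balance
    lowest = opening_balance
    for _, amount in sorted(transactions):
        balance += amount
        lowest = min(lowest, balance)
    return {
        "total_inflow": inflow,
        "total_outflow": outflow,
        "net_cash_flow": inflow - outflow,
        "lowest_balance": lowest,
        # Ratio above 1 means spending exceeded income in the period.
        "outflow_to_inflow": outflow / inflow if inflow else None,
    }


# Hypothetical month of activity, in whole dollars.
sample = [
    ("2024-01-01", 2000),   # paycheck
    ("2024-01-03", -1200),  # rent
    ("2024-01-10", -500),
    ("2024-01-15", 2000),   # paycheck
    ("2024-01-20", -900),
    ("2024-01-28", -1600),
]
features = cash_flow_features(300, sample)
print(features)
```

Features like these describe how a consumer manages money month to month, which a bureau file built on repayment history does not show directly.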

Practical Paths to Adoption

One major barrier to innovation is operational risk, especially among smaller lenders that are wary of adopting unfamiliar technology. FinRegLab’s paper addresses that concern by testing staggered adoption strategies. The takeaway: You don’t have to go all in on full hybrid ML models to make progress.

Each individual innovation contributed quantifiable gains. Applying ML to traditional bureau data made the models more predictive and reduced default risk. Adding cash flow data to LR models delivered gains as well.

These findings suggest that incremental improvements, such as adding cash flow data to baseline models, can improve performance and reduce bias.

Compliance and Transparency Still Matter

Compliance is always part of innovation, and lenders using ML face explainability challenges when issuing adverse action notices under the Equal Credit Opportunity Act.

They also face challenges in managing fairness and model governance. However, the study offers early signs that these issues can be addressed: the hybrid ML models were more predictive and delivered more consistent results across demographic groups.

Looking Ahead

FinRegLab’s research charts a path forward for credit scoring professionals looking to update their models. Cash flow data and machine learning are not magic bullets, but together they form a robust tool that improves both risk management and inclusion.

The results are convincing: greater accuracy, fairer outcomes, and more opportunity for consumers who have been excluded for far too long.