Study Finds Cash Flow and Machine Learning Models Unlock Fairer, Safer Credit Approvals

Writer: Eric Bank

Editor: Lillian Guevara-Castro

Reviewer: Adam West

Updated: July 7, 2025

Experts share their tips and advice on BadCredit.org, with the goal of helping subprime consumers. Our articles follow strict editorial guidelines.

Key Takeaways

Hybrid models excel: Models that integrate cash flow and credit bureau data with machine learning (ML) are more predictive and more inclusive than traditional methods.
Both performance and fairness are possible: ML and cash flow models reduce approval differentials without elevating default risk.
Phased adoption is possible: Banks are not required to adopt both ML and cash flow data at once to see real gains.

More accurate, more inclusive credit scoring doesn’t have to come at the expense of risk control. That’s what FinRegLab’s new empirical study found when it tested eight underwriting models built on traditional and alternative data sets, using both logistic regression (LR) and machine learning (ML).

The researchers found that incorporating cash flow data (data taken directly from consumers’ bank account transactions) into ML algorithms produced the highest gains in predictiveness and inclusiveness.

The findings may offer credit scoring professionals real-world and timely applications. These models, when properly implemented, are able to reach borrowers who are excluded by standard scores without increasing charge-offs.

That offers hope to neglected segments and gives lenders a way to balance fairness and profit. Implementation choices still matter, though, and even incremental improvements bring meaningful gains.

ML and Cash Flow Unlock Complementary Gains

FinRegLab compared XGBoost-based ML models with LR models under four data settings: credit bureau data as conventionally used, cash flow data alone, and two hybrid settings. The results are clear: ML consistently outperformed LR, and cash flow data improved performance for both model types.

[Graphic: a woman on a laptop] Cash flow data combined with ML algorithms presents a more predictive and inclusive credit scoring model.

For example, among models built on credit bureau data, ML improved ROC-AUC, a standard indicator of predictiveness, by 1.78 points compared to LR. The hybrid ML model using both bureau and cash flow data improved it by more than two points.
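For readers less familiar with the metric, here is a minimal Python sketch of how ROC-AUC can be computed and how a gain in "points" might be read. It is an illustration only, not FinRegLab's code: the applicants, outcomes, and risk scores are made up, the function name is hypothetical, and it assumes one "point" means one hundredth of AUC.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the share of (defaulter, non-defaulter) pairs in which
    the defaulter gets the higher risk score. Ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy outcomes: 1 = defaulted, 0 = repaid.
defaulted = [0, 0, 1, 0, 1, 0, 1, 0]

# Made-up risk scores from two hypothetical models (higher = riskier).
baseline_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.50, 0.45, 0.30]
improved_scores = [0.10, 0.30, 0.55, 0.20, 0.80, 0.58, 0.60, 0.25]

auc_baseline = roc_auc(defaulted, baseline_scores)
auc_improved = roc_auc(defaulted, improved_scores)

# Assumption: one "point" is 0.01 of AUC.
gain_in_points = (auc_improved - auc_baseline) * 100

print(f"baseline AUC: {auc_baseline:.3f}")
print(f"improved AUC: {auc_improved:.3f}")
print(f"gain: {gain_in_points:.2f} points")
```

An AUC of 0.5 means a model ranks defaulters no better than chance, and 1.0 means it ranks every defaulter above every non-defaulter. The toy gap here is far larger than anything reported in the study; on a real portfolio, a gain of a point or two can still shift many individual decisions.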

Simulation of approvals found that these ML models would approve up to 4% more applicants at conservative risk thresholds while reducing approvals of future defaulters by at least 9%.

In practical terms, that translates to millions of new credit card accounts and hundreds of thousands of new mortgage approvals for previously misclassified lower-risk consumers.
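The approval simulation described above can be pictured with a short Python sketch. This is a simplified illustration under stated assumptions, not the study's methodology: it assumes a lender approves anyone whose predicted risk is at or below a fixed cutoff, and the applicants, scores, cutoff, and function name are all hypothetical.

```python
def simulate_approvals(defaulted, risk_scores, threshold):
    """Approve every applicant whose predicted risk is at or below the
    threshold, then summarize who got through."""
    approved_outcomes = [
        outcome
        for outcome, score in zip(defaulted, risk_scores)
        if score <= threshold
    ]
    return {
        "approval_rate": len(approved_outcomes) / len(defaulted),
        "defaulters_approved": sum(approved_outcomes),
    }


# Hypothetical outcomes for ten applicants: 1 = later defaulted, 0 = repaid.
defaulted = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]

# Made-up predicted default risk from two hypothetical models.
baseline_scores = [0.10, 0.20, 0.35, 0.25, 0.15, 0.40, 0.28, 0.22, 0.45, 0.60]
hybrid_scores = [0.08, 0.18, 0.27, 0.42, 0.12, 0.29, 0.28, 0.20, 0.26, 0.70]

RISK_THRESHOLD = 0.30  # assumed conservative cutoff, chosen for illustration

baseline = simulate_approvals(defaulted, baseline_scores, RISK_THRESHOLD)
hybrid = simulate_approvals(defaulted, hybrid_scores, RISK_THRESHOLD)

print("baseline:", baseline)
print("hybrid:  ", hybrid)
```

In this toy example, the second model approves more applicants at the same cutoff while letting fewer eventual defaulters through, which is the pattern the study reports. The size of the effect here is arbitrary and says nothing about the study's actual figures.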

Improved Access, Especially for Underserved Borrowers

Despite the dataset skewing toward prime borrowers, FinRegLab found that ML and hybrid models were more successful in each demographic subgroup it evaluated — low-income, minority, and credit-challenged cohorts.

Approval rates rose most consistently among recent subprime borrowers and fell most significantly among low-to-moderate income applicants who later defaulted. In short, the models broadened access while making approvals more precise.

This performance supports the argument that new data and tools have the potential to reduce systemic biases in established models. By using granular inflow and outflow data, cash flow underwriting captures financial resilience that traditional scoring ignores.

Practical Paths to Adoption

One major inhibitor of innovation is operational risk, especially among smaller lenders who are wary of adopting unfamiliar technology. FinRegLab’s paper addresses that concern by testing staggered adoption strategies. The takeaway: You don’t have to go all in on full hybrid ML models to make progress.

Each individual innovation contributed quantifiable gains. Applying ML to traditional bureau data made the models more predictive and reduced default risk. Adding cash flow data to LR models produced a gain as well.

These findings suggest that incremental improvements — like adding cash flow to baseline models — can improve performance and diminish bias.

Compliance and Transparency Still Matter

Compliance is always a part of innovation, and ML-based lenders are faced with explainability challenges when it comes to adverse action notices under the Equal Credit Opportunity Act.

They also face challenges with fairness and model governance. However, the study offers early evidence that these issues can be addressed: The hybrid ML models were more predictive and delivered more consistency across demographic groups.

Looking Ahead

FinRegLab’s research defines a path forward for credit scoring professionals looking to refresh their models. Cash flow information and machine learning are not magic bullets, but together they make a robust tool that improves both risk management and inclusion.

The results are convincing: greater accuracy, fairer outcomes, and more opportunity for consumers who have been excluded for far too long.

Eric Bank has been covering business and financial topics since 1985, specializing in taking complex subject matters and explaining them in simple terms for consumer audiences. Eric's writing appears on Credible.com, eHow, WiseBread, The Nest, Get.com, Zacks, Chron, and dozens of other outlets. A former software engineer, Eric holds an M.B.A. from New York University and an M.S. in finance from DePaul University.
