
Key Takeaways
- Hybrid models excel: Models that integrate cash flow and credit bureau data with machine learning (ML) are more predictive and more inclusive than traditional methods.
- Both performance and fairness are possible: ML and cash flow models reduce approval differentials without elevating default risk.
- Phased adoption is possible: Banks are not required to adopt both ML and cash flow data at once to see real gains.
More accurate, more inclusive credit scoring does not have to come at the cost of higher risk. That’s what FinRegLab’s new empirical study found when it tested eight underwriting models built on traditional and alternative data, using both logistic regression (LR) and machine learning (ML).
The researchers found that feeding cash flow data, drawn directly from consumers’ bank account transactions, into ML algorithms produced the largest gains in predictiveness and inclusiveness.
The findings offer credit scoring professionals timely, practical applications. Properly implemented, these models can reach borrowers whom standard scores exclude without increasing charge-offs.
That offers hope to underserved segments and gives lenders a way to balance fairness and profit. Implementation choices still matter, though, and even incremental improvements bring meaningful gains.
ML and Cash Flow Unlock Complementary Gains
FinRegLab compared XGBoost-based ML models with LR across four data settings: credit bureau data as conventionally used, cash flow data, and two hybrid settings. The results are clear: ML consistently outperformed LR, and cash flow data improved performance for both model types.

For example, among models built on credit bureau data, ML improved ROC-AUC, a standard measure of predictiveness, by 1.78 points compared with LR. The hybrid ML model using both bureau and cash flow data improved it by more than two points.
Simulation of approvals found that these ML models would approve up to 4% more applicants at conservative risk thresholds while reducing approvals of future defaulters by at least 9%.
In practical terms, that translates to millions of new credit card accounts and hundreds of thousands of new mortgage approvals for lower-risk consumers who were previously misclassified.
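To make the two measures in this section concrete, here is a minimal Python sketch of how ROC-AUC and a threshold-based approval simulation can be computed. It is an illustration only, not FinRegLab's methodology: the data is hypothetical toy data, the names roc_auc, simulate_approvals, and max_risk are invented for this example, and it assumes that a "point" of ROC-AUC means one hundredth of the 0-to-1 AUC scale.

```python
"""Toy illustration of ROC-AUC and a threshold-based approval simulation.

All data below is hypothetical; none of it comes from the FinRegLab study.
"""


def roc_auc(defaulted, risk_scores):
    """Probability that a random defaulter is scored riskier than a random
    non-defaulter (ties count as half). This equals the area under the ROC curve."""
    pos = [s for d, s in zip(defaulted, risk_scores) if d == 1]
    neg = [s for d, s in zip(defaulted, risk_scores) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def simulate_approvals(defaulted, risk_scores, max_risk):
    """Approve every applicant whose predicted default risk is <= max_risk."""
    approved_outcomes = [d for d, s in zip(defaulted, risk_scores) if s <= max_risk]
    n_approved = len(approved_outcomes)
    return {
        "approval_rate": n_approved / len(defaulted),
        "defaulters_approved": sum(approved_outcomes),
        "default_rate_among_approved": (
            sum(approved_outcomes) / n_approved if n_approved else 0.0
        ),
    }


# Hypothetical applicants: 1 = later defaulted, 0 = repaid.
defaulted = [0, 0, 1, 0, 1, 0, 1, 0]
# Hypothetical predicted default risk from two competing models.
baseline_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.50, 0.45, 0.30]
challenger_scores = [0.10, 0.30, 0.55, 0.20, 0.80, 0.40, 0.60, 0.25]

baseline_auc = roc_auc(defaulted, baseline_scores)
challenger_auc = roc_auc(defaulted, challenger_scores)
# Assumption: one "point" is 0.01 on the 0-1 AUC scale.
auc_gain_points = (challenger_auc - baseline_auc) * 100

baseline_result = simulate_approvals(defaulted, baseline_scores, max_risk=0.40)
challenger_result = simulate_approvals(defaulted, challenger_scores, max_risk=0.40)

print(f"AUC gain in points: {auc_gain_points:.1f}")
print("baseline:", baseline_result)
print("challenger:", challenger_result)
```

In this toy case both models approve the same number of applicants at the same cutoff, but the challenger approves fewer future defaulters. In practice, scores from different models would need to be calibrated before a shared cutoff is meaningful.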
Improved Access, Especially for Underserved Borrowers
Despite the dataset skewing toward prime borrowers, FinRegLab found that ML and hybrid models were more successful in each demographic subgroup it evaluated — low-income, minority, and credit-challenged cohorts.
Approval rates rose most consistently among recent subprime borrowers and fell most sharply among low-to-moderate-income borrowers who later defaulted. In short, the models broadened access while also making it more precise.
These results support the argument that new data and tools can reduce systemic biases in established models. By using granular inflow and outflow data, cash flow underwriting captures financial resilience that traditional scoring overlooks.
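The subgroup comparisons in this section come down to approval rates measured separately for each group. The sketch below shows one simple way to compute them, along with the gap between the highest and lowest group rates. It is an assumption-laden illustration: the group labels, the approval decisions, and the function names approval_rates_by_group and approval_gap are all hypothetical, and the study's own fairness metrics may differ.

```python
from collections import defaultdict


def approval_rates_by_group(groups, approved):
    """Return {group: share of that group's applicants who were approved}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved count, total count]
    for group, decision in zip(groups, approved):
        counts[group][0] += int(decision)
        counts[group][1] += 1
    return {group: ok / total for group, (ok, total) in counts.items()}


def approval_gap(rates):
    """Difference between the highest and lowest group approval rates."""
    return max(rates.values()) - min(rates.values())


# Hypothetical applicants in two groups, with decisions from two models
# (1 = approved, 0 = declined). Not data from the study.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
approved_baseline = [1, 1, 1, 0, 1, 0, 0, 0]
approved_challenger = [1, 1, 1, 0, 1, 1, 0, 0]

baseline_rates = approval_rates_by_group(groups, approved_baseline)
challenger_rates = approval_rates_by_group(groups, approved_challenger)

print("baseline gap:", approval_gap(baseline_rates))
print("challenger gap:", approval_gap(challenger_rates))
```

A narrower gap alone does not show that a model is fairer or safer; it would need to be read alongside default outcomes for the newly approved applicants, as the study does.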
Practical Paths to Adoption
One major barrier to innovation is operational risk, especially among smaller lenders wary of adopting unfamiliar technology. FinRegLab’s paper addresses that concern by testing phased adoption strategies. The takeaway: You don’t have to go all in on full hybrid ML models to make progress.
Each individual innovation contributed measurable gains. Applying ML to traditional bureau data made the models more predictive and reduced default risk. Adding cash flow data to LR models produced a gain as well.
These findings suggest that incremental improvements — like adding cash flow to baseline models — can improve performance and diminish bias.
Compliance and Transparency Still Matter
Compliance is always part of innovation, and lenders using ML face explainability challenges when producing adverse action notices under the Equal Credit Opportunity Act.
They also face challenges in managing fairness and model governance. However, the study offers early signs that these issues can be addressed: the hybrid ML models were more predictive and performed more consistently across demographic groups.
Looking Ahead
FinRegLab’s research charts a path forward for credit scoring professionals looking to update their models. Cash flow data and machine learning are not magic bullets, but together they make a robust tool that improves both risk management and inclusion.
The results are convincing: greater accuracy, fairer outcomes, and more opportunity for consumers who have been excluded for far too long.