Credit & finance data science that holds up in practice.
Why data science behaves differently in credit underwriting than in ordinary ML. Selection bias, causal inference, calibration, validation, fairness, and regulation, in the language of practice.
Latest
-
[Basics] Part 5. Ranking isn't enough: three axes for evaluating a credit model
How do you know whether you built a good model? In credit you don't just check whether it ranks well (discrimination). You read discrimination with AUC and PR-AUC, check whether the probabilities match reality with calibration, and check whether it holds up over time with PSI. Here are the two axes ordinary ML tends to skip.
-
[Review] Can Google's new tabular foundation model TabFM beat GBM in credit? I tested it on public data
Google's zero-shot tabular foundation model TabFM claims to beat even a well-tuned GBM with no training and no tuning. Can it actually be used on credit losses? A practitioner's review, pitting it against a carefully built GBM on public credit-card data.
-
[Basics] Part 4. Building a credit model: scorecards and trees
If Part 3 was about choosing a model, this piece is about actually building one. How to build a scorecard with logistic regression (WOE, IV, score scaling) and how to build a model with trees (features, SHAP, monotone constraints), where the two diverge, and the reject inference and calibration you have to run no matter which model you pick.
-
[Deep Dive] Where do rejected applicants go? Reject inference and rejectkit
A credit model learns only from the people it approved, yet it's judged on every applicant, rejects included. I bundled eight reject-inference techniques for correcting that sample-selection bias behind one API and, more importantly, built rejectkit, a Python library that measures whether the correction actually helps on your own data. Both are now public.
-
[Paper] SSL falls short of GBM on credit data. But combined, it helps
Can self-supervised learning (SSL) beat GBM at credit default prediction? I ran the experiment on public data (AMEX). On its own, SSL falls short of GBM, but bolted onto GBM's features, it lifts performance by a statistically meaningful margin. And that lift was concentrated in the hidden defaults among customers GBM thought were safe.
-
[Basics] Part 3. Where deep learning doesn't win: machine learning for scoring
Credit data is tabular. And on tabular data, the winner isn't a flashy deep net; it's tree-based boosting. Here's why picking on performance lands you at a tree as the final model, why logistic regression is still in use, and why cross-validation in finance has to be done differently.