Articles
-
[Basics] Part 5. Ranking isn't enough: three axes for evaluating a credit model
How do you know whether you built a good model? In credit, checking whether it ranks well (discrimination) is not enough. You measure discrimination with AUC and PR-AUC, check whether the predicted probabilities match reality with calibration, and check whether the model holds up over time with PSI. Here are the two axes ordinary ML tends to skip.
-
[Review] Can Google's new tabular foundation model TabFM beat GBM in credit? I tested it on public data
Google's zero-shot tabular foundation model TabFM claims to beat even a well-tuned GBM with no training and no tuning. Can it actually be used on credit losses? A practitioner's review, pitting it against a carefully built GBM on public credit-card data.
-
[Basics] Part 4. Building a credit model: scorecards and trees
If Part 3 was about choosing a model, this piece is about actually building one. It covers how to build a scorecard with logistic regression (WOE, IV, score scaling), how to build a tree-based model (features, SHAP, monotone constraints), where the two diverge, and the reject inference and calibration you have to run no matter which model you pick.
-
[Deep Dive] Where do rejected applicants go? Reject inference and rejectkit
A credit model learns only from the people it approved, yet it's judged on every applicant, rejects included. I bundled eight reject-inference techniques for correcting that sample-selection bias behind one API, and — more importantly — built rejectkit, a Python library that measures whether the correction actually helps on your own data. Both are now public.
-
[Paper] SSL falls short of GBM on credit data. But combined, it helps
Can self-supervised learning (SSL) beat GBM at credit default prediction? I ran the experiment on public data (AMEX). On its own, SSL falls short of GBM — but bolted onto GBM's features, it lifts performance by a statistically meaningful margin. And that lift was concentrated in the hidden defaults among customers GBM thought were safe.
-
[Basics] Part 3. Where deep learning doesn't win: machine learning for scoring
Credit data is tabular. And on tabular data, the winner isn't a flashy deep net — it's tree-based boosting. Here's why picking on performance lands you at a tree as the final model, why logistic regression is still in use, and why cross-validation in finance has to be done differently.
-
[Deep Dive] Does raising a credit limit increase defaults? A test on three public datasets
If you raise someone's credit limit, does their probability of default go up or down? Intuition says up, but the data says the opposite: down. This post untangles that paradox with debiasing, tests it on three public datasets, and works out when the sign of the limit effect actually flips.
-
[Basics] Part 2. Statistics first: how to read credit data
Before you reach for machine learning, statistics comes first. In credit, you ask 'is this difference real or just noise?' far more often than 'does the model fit well?' Here's the shape of financial data, the trap of multiple testing, how to handle small samples, and the bias that's baked in by default.
-
[Basics] Part 1. The card business and credit risk: where underwriting models begin
What a model should optimize for comes, in the end, from the business. Here's a walk through where a card issuer earns and where it loses, how credit loss breaks into parts, and how regulation makes its way inside the model. Consider it the domain groundwork for understanding an underwriting model.
-
[Basics] Part 0. 7 ways finance data science differs from ordinary ML
People who are great at everything from building ML models to evaluating them still trip up when they reach credit underwriting. It isn't a skill gap — the field runs on different rules. From selection bias to regulation, here are 7 ways finance data science is structurally different from ordinary ML.