Credit & finance data scientist, based in Japan

Credit & finance data science that holds up in practice.

Why data science behaves differently in credit underwriting than in ordinary ML. Selection bias, causal inference, calibration, validation, fairness, and regulation, in the language of practice.


BLOG

Latest

Basics Jul 2, 2026

[Basics] Part 5. Ranking isn't enough: three axes for evaluating a credit model

How do you know whether you built a good model? In credit you don't just check whether it ranks well (discrimination). You read discrimination with AUC and PR-AUC, check whether the probabilities match reality with calibration, and check whether it holds up over time with PSI. Here are the two axes ordinary ML tends to skip.
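As a rough illustration of those three axes, here is a minimal standard-library sketch. The function names (`auc`, `calibration_gap`, `psi`), the choice of ten quantile bins, and the `eps` floor are assumptions made for this example, not the method used in the post. It assumes labels are 1 for default and 0 otherwise, and that every sample is non-empty.

```python
import math

def auc(y_true, scores):
    """Discrimination: chance that a random default scores above a random non-default."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Count pairwise wins; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_gap(y_true, probs):
    """Calibration in the large: mean predicted probability minus observed default rate."""
    return sum(probs) / len(probs) - sum(y_true) / len(y_true)

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Stability: Population Stability Index between a reference and a recent score sample."""
    # Bin edges are quantiles of the reference (development) sample.
    ref = sorted(expected)
    edges = [ref[int(len(ref) * k / n_bins)] for k in range(1, n_bins)]

    def shares(scores):
        counts = [0] * n_bins
        for s in scores:
            # Bin index = number of edges at or below the score.
            counts[sum(1 for e in edges if s >= e)] += 1
        # Floor empty bins at eps so the log below stays finite (an assumption).
        return [max(c / len(scores), eps) for c in counts]

    p, q = shares(expected), shares(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Toy data for illustration only.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A model can score well on `auc` while `calibration_gap` is far from zero or `psi` drifts upward on recent applicants, which is why the three are read separately.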

Review Jul 2, 2026

[Review] Can Google's new tabular foundation model TabFM beat GBM in credit? I tested it on public data

Google's zero-shot tabular foundation model TabFM claims to beat even a well-tuned GBM with no training and no tuning. Can it actually be used on credit losses? A practitioner's review, pitting it against a carefully built GBM on public credit-card data.

Basics Jun 29, 2026

[Basics] Part 4. Building a credit model: scorecards and trees

If Part 3 was about choosing a model, this piece is about actually building one. How to build a scorecard with logistic regression (WOE, IV, score scaling) and how to build one with trees (features, SHAP, monotone constraints), where the two diverge, and the reject inference and calibration you have to run no matter which model you picked.
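For readers new to the scorecard terms, WOE and IV can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper `woe_iv` is hypothetical, the sign convention ln(good share / bad share) is one of two in common use (some texts flip it), and the bin counts are made up.

```python
import math

def woe_iv(bins):
    """bins: list of (n_good, n_bad) counts, one pair per bin of a single feature."""
    total_good = sum(g for g, _ in bins)
    total_bad = sum(b for _, b in bins)
    woes, iv = [], 0.0
    for g, b in bins:
        if g == 0 or b == 0:
            # WOE is undefined for an empty cell; merge bins or smooth counts first.
            raise ValueError("bin with zero goods or zero bads")
        share_good = g / total_good
        share_bad = b / total_bad
        # Weight of evidence: log ratio of the bin's share of goods to its share of bads.
        woe = math.log(share_good / share_bad)
        woes.append(woe)
        # Information value sums the share difference weighted by WOE across bins.
        iv += (share_good - share_bad) * woe
    return woes, iv

# Illustrative counts only, not from any real portfolio.
woes, iv = woe_iv([(90, 10), (60, 40)])
```

Under this convention a positive WOE marks a bin with relatively more goods than bads, and a feature's IV grows as its bins separate the two groups more sharply.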

Deep Dive Jun 29, 2026

[Deep Dive] Where do rejected applicants go? Reject inference and rejectkit

A credit model learns only from the people it approved, yet it's judged on every applicant, rejects included. I bundled eight reject-inference techniques for correcting that sample-selection bias behind one API, and — more importantly — built rejectkit, a Python library that measures whether the correction actually helps on your own data. Both are now public.

Paper Jun 28, 2026

[Paper] SSL falls short of GBM on credit data. But combined, it helps

Can self-supervised learning (SSL) beat GBM at credit default prediction? I ran the experiment on public data (AMEX). On its own, SSL falls short of GBM — but bolted onto GBM's features, it lifts performance by a statistically meaningful margin. And that lift was concentrated in the hidden defaults among customers GBM thought were safe.

Basics Jun 25, 2026

[Basics] Part 3. Where deep learning doesn't win: machine learning for scoring

Credit data is tabular, and on tabular data the winner isn't a flashy deep net but tree-based boosting. Here's why choosing on performance leads you to a tree as the final model, why logistic regression is still in use, and why cross-validation in finance has to be done differently.

