Making CEL faster: from AST interpreter to compiled closures

Five independent optimizations for cel-rust’s CEL evaluator, from simple regex caching to a full expression closure compiler with typed Schema, delivering a 16–31× speedup on ARM64 Neoverse N1.

May 27, 2026 · 10 min · Maxime Guerreiro

RTX 5090 power scaling: 450W vs 575W training

RTX 5090 power scaling from 400W to 600W on a personal workstation. Lower TDP saves ~€34/year at 80% idle and reduces sustained thermal stress in a residential build. 475W–500W is the practical sweet spot between speed and peace of mind.

May 21, 2026 · 10 min · Maxime Guerreiro

How I use agents to write this blog

The actual workflow: from idea to published finding through a loop of playgrounds, benchmarks, and iterative drafts.

May 21, 2026 · 6 min · Maxime Guerreiro

Soft distillation vs. gradient boosting on fraud

We benchmarked 52 method variants across 22 fraud and non-fraud configs. On hard fraud data, every gradient booster crushes TabPFN/TabICL by 15–20 AUC points while being 4–7× faster. Soft distillation helps only at medium scale. Teacher-as-feature is catastrophic. We quantify effect sizes with Cohen’s d and show why production fraud teams should think twice about foundation models.

May 20, 2026 · 19 min · Maxime Guerreiro

Replicating Talking Trees: LLMs for fraud detection

We replicate the Talking Trees method (Yandex Research, 2025) on fraud-detection datasets using Kimi K2.6 and GPT-5.5. The LLM-guided tree beats sklearn by +0.04 AUC but is crushed by XGBoost (+0.11 AUC) at 1000× the cost. Kimi achieves higher peak accuracy but falls back 40% of the time; GPT-5.5 is more reliable (7% fallback) but slightly weaker.

May 20, 2026 · 9 min · Maxime Guerreiro

Allocator shootout for async Rust on ARM64

jemalloc’s MADV_DONTNEED strategy triggers hundreds of thousands of aggressive page returns to the OS during large-message Tokio MPSC benchmarks, producing millions of demand-zero page faults. At 16 KB messages this causes a 62% regression versus std; the same allocator wins by 2× on small-object task spawn churn. The effect is allocation-size dependent, not async-pattern dependent.

May 19, 2026 · 14 min · Maxime Guerreiro