Posts on Maxime Guerreiro

https://punkeel.com/posts/
Recent content in Posts on Maxime Guerreiro
Maxime Guerreiro
https://punkeel.com/og-image.png
Hugo
en
Thu, 21 May 2026 18:17:51 +0000

RTX 5090 power scaling: 450W vs 575W training
https://punkeel.com/posts/gpu-power-scaling-llm-training/
Thu, 21 May 2026 18:17:51 +0000
RTX 5090 power scaling for home workstations: lower TDP saves real energy per training run and reduces thermal stress, with only 7–11% wall-time penalty at 475W–500W. Yearly savings are modest for an idle-heavy personal machine (~€34), but the safety margin is real.

How I use agents to write this blog
https://punkeel.com/posts/agent-workflow/
Thu, 21 May 2026 11:17:01 +0000
The actual workflow: from idea to published finding through a loop of playgrounds, benchmarks, and iterative drafts.

Soft distillation vs. gradient boosting on fraud
https://punkeel.com/posts/fraud-benchmark-v4-soft-distillation/
Wed, 20 May 2026 19:44:34 +0000
We benchmarked 52 method variants across 22 fraud and non-fraud configs. On hard fraud data, every gradient booster crushes TabPFN/TabICL by 15–20 AUC points while being 4–7× faster. Soft distillation helps only at medium scale. Teacher-as-feature is catastrophic. We quantify effect sizes with Cohen’s d and show why production fraud teams should think twice about foundation models.

Replicating Talking Trees: LLMs for fraud detection
https://punkeel.com/posts/fraud-benchmark-llm-tabular/
Wed, 20 May 2026 15:14:28 +0000
We replicate the Talking Trees method (Yandex Research, 2025) on fraud-detection datasets using Kimi K2.6 and GPT-5.5. The LLM-guided tree beats sklearn by +0.04 AUC but is crushed by XGBoost (+0.11 AUC) at 1000× the cost. Kimi achieves higher peak accuracy but falls back 40% of the time; GPT-5.5 is more reliable (7% fallback) but slightly weaker.

Allocator shootout for async Rust on ARM64
https://punkeel.com/posts/allocator-shootout-async/
Tue, 19 May 2026 17:12:45 +0000
jemalloc's MADV_DONTNEED strategy triggers aggressive page returns to the OS during large-message Tokio MPSC benchmarks, producing demand-zero page fault storms. At 16 KB this causes a 62% regression versus std, yet the same allocator wins by 2× on small-object churn. Allocation-size dependent, not async-pattern dependent.

Fine-tuning TabICL: when 30 epochs of GPU time buys you 0.3 pp
https://punkeel.com/posts/tabicl-finetuning/
Mon, 18 May 2026 17:38:01 +0000
TabICL exposes a built-in fine-tuning pipeline via FinetunedTabICLClassifier. On five real-world classification datasets, I compared zero-shot TabICL against fine-tuned TabICL (30 epochs, early stopping, validation-driven hyperparameter selection). The result: fine-tuning helps on some datasets, hurts on others, and never moves AUC by more than ±0.7 pp. On telco-churn it is consistently beneficial (+0.16 to +0.59 pp). On cc-fraud it is completely flat — zero-shot is already near-perfect. The only consistent signal is that fine-tuning with too little data or the wrong seed can degrade performance.

Agent architecture: where the work runs
https://punkeel.com/posts/agent-architecture/
Mon, 18 May 2026 10:13:14 +0000
Hermes Agent orchestrates two persistent machines — a free-tier ARM64 VPS and a custom x86-64 workstation — to run Rust and PyTorch workloads without sandbox churn.

When stacking works: it depends on which features your models look at
https://punkeel.com/posts/feature-disagreement-stacking/
Sun, 17 May 2026 15:02:01 +0000
Stacking TabPFN3, TabICL, and XGBoost provides at most +0.5 pp AUC on most tabular datasets. But on heavily imbalanced fraud detection, the ensemble is dramatically more robust. The reason is not model diversity in the abstract—it is concrete feature disagreement. XGBoost and TabPFN disagree strongly on which features matter for fraud (Spearman ρ = 0.24), while they agree closely on every other dataset (ρ = 0.67–0.95). When models look at different features, stacking hedges correlated failure modes. When they look at the same features, stacking is just expensive averaging.

TabPFN3 vs TabICL: a matched-size fraud-benchmark sweep
https://punkeel.com/posts/tabpfn-vs-tabicl-fdb/
Sat, 16 May 2026 17:05:37 +0000
PFN wins below 10k rows, ICL catches up by 100k, and PFN degrades beyond 200k. Both are 2× apart in speed because PFN is 2× larger. We also found a clean 21% inference speedup with bfloat16 autocast.

Inlining Tokio MPSC recv: removing the async tax
https://punkeel.com/posts/tokio-mpsc-recv-inline/
Sat, 16 May 2026 10:43:47 +0000
Two #[inline] annotations on the innermost recv path improve large-object throughput by 14.7% and medium objects by 11% with no regressions.

Tokio MPSC Sweep: message size vs latency
https://punkeel.com/posts/tokio-mpsc-sweep/
Fri, 15 May 2026 12:08:47 +0000
Benchmarking tokio::sync::mpsc against crossbeam::channel across ARM64 (OCI Ampere A1) and x86-64 (AMD Ryzen 9 9900X). Varying message sizes from 8 B to 32 KB to find where the async tax bites — and where it disappears.

Amortizing tokio's global queue acquisitions
https://punkeel.com/posts/tokio-batch-pop/
Thu, 14 May 2026 20:17:15 +0000
On the tokio multi-thread scheduler’s worst-case benchmark, pulling tasks from the inject queue in batches rather than one at a time reduces latency by 92%. The change reuses a batch-pop helper already present in the idle path, capped at 32 to prevent burying local work behind converted-remote tasks.

Kitchen Sink
https://punkeel.com/posts/kitchen-sink/
Thu, 14 May 2026 19:07:01 +0000
Reference post demonstrating every content feature supported by the site: math, diagrams, figures, and interactive sketches.

Announcing Cloudflare Account Abuse Protection: prevent fraudulent attacks from bots and humans
https://punkeel.com/posts/account-abuse-protection/
Thu, 12 Mar 2026 05:00:00 +0000
Blocking bots isn't enough anymore. Cloudflare's new fraud prevention capabilities — now available in Early Access — help stop account abuse before it starts.

Forget IPs: using cryptography to verify bot and agent traffic
https://punkeel.com/posts/web-bot-auth/
Thu, 15 May 2025 13:00:00 +0000
Bots now browse like humans. We're proposing bots use cryptographic signatures so that website owners can verify their identity.

Introducing Ephemeral IDs: a new tool for fraud detection
https://punkeel.com/posts/turnstile-ephemeral-ids/
Mon, 23 Sep 2024 13:00:00 +0000
As the Internet evolves, Turnstile does too. Introducing Ephemeral IDs — a new dimension in detecting fraudulent activity, bot or human, that links behavior to a specific client instead of an IP address.

Cloudflare is free of CAPTCHAs; Turnstile is free for everyone
https://punkeel.com/posts/turnstile-ga/
Fri, 29 Sep 2023 00:00:00 +0000
For years, we've written that CAPTCHAs drive us crazy. Humans give up on CAPTCHA puzzles approximately 15% of the time and, maddeningly, CAPTCHAs are significantly easier for bots to solve than they are for humans.

Announcing Turnstile, a user-friendly, privacy-preserving alternative to CAPTCHA
https://punkeel.com/posts/turnstile-private-captcha-alternative/
Wed, 28 Sep 2022 00:00:00 +0000
Today, we're announcing the open beta of Turnstile, an invisible alternative to CAPTCHA. Anyone, anywhere on the Internet, who wants to replace CAPTCHA on their site will be able to call a simple API.

Eliminating CAPTCHAs on iPhones and Macs using new standard
https://punkeel.com/posts/private-access-tokens/
Wed, 08 Jun 2022 00:00:00 +0000
Today we're announcing Private Access Tokens, a completely invisible, private way to validate that real users are visiting your site. Visitors using operating systems that support these tokens can now prove they're human without completing a CAPTCHA.

CVE-2020-26886: Local Privilege Escalation using softaculous/bin/soft
https://punkeel.com/posts/cve-2020-26886/
Sat, 31 Oct 2020 16:00:00 +0000
Beware of the setuid binaries on your machine, especially the ones you actually use!

suPHP - The vulnerable ghost in your shell
https://punkeel.com/posts/suphp-ghost-in-your-shell/
Mon, 21 Sep 2020 15:00:00 +0000
Beware of the setuid binaries on your machine, especially the ones you no longer use!

Enabling LSFileQuarantineEnabled on cli binaries
https://punkeel.com/posts/quarantine-your-cli-binaries/
Sat, 02 May 2020 00:01:00 +0000
Playing with macOS Security to better understand it

Protecting Project Galileo websites from HTTP attacks
https://punkeel.com/posts/cf-protecting-galileo/
Thu, 13 Jun 2019 17:00:00 +0000
Yesterday, we celebrated the fifth anniversary of Project Galileo. More than 550 websites are part of this program, and they have something in common: each and every one of them has been subject to attacks in the last month.

Information leak in Minecraft 1.8
https://punkeel.com/posts/minecraft-18-file-access/
Wed, 12 Sep 2018 21:00:00 +0000
A flaw in Minecraft 1.8 allows anyone to access files on your computer

USB Port Security: Where to Begin?
https://punkeel.com/posts/secure-usb-ports/
Sat, 16 Sep 2017 17:15:00 +0000
Exploring USB port threats and how to protect against them - insights from my internship at OVH's Security Operations Center