RTX 5090 power scaling: is max wattage just a space heater?

I wanted to know whether maxing out an RTX 5090 FE’s power limit actually buys real speed, or if it just turns your desk into a space heater.

I set up a test bench running Debian 13 on a Ryzen 9 9900X with 64 GB of DDR5.

The workload was a 60 million parameter transformer trained to learn integer addition. That model size is big enough to keep the compute units busy.

I swept the power cap from 400 W to 600 W using nvidia-smi. The card ships at 575 W by default, and the hardware ceiling is 600 W.

To compare configurations, I measured wall time. Here is how each limit performed relative to the 575 W baseline:

The scaling is almost perfectly linear.

Dropping to 400 W stretched the run by 23% compared with the baseline.

Pushing to 600 W only shortened it by 1.8%.

That is a terrible trade.

The 600 W run also consumed 2.4% more total energy than the baseline for that tiny speedup.

You might expect a slower run to waste more energy because the GPU stays powered on longer.

The meter says the opposite.

I measured total energy at the wall. The 400 W run drew 71.0 Wh, while the 575 W baseline drew 82.7 Wh.

Even though the 400 W configuration ran 23% longer, it still used 14% less electricity overall.

I checked GPU utilization and it stayed pinned at 99% across every single test.

The model was large enough to saturate the silicon regardless of the power cap.

The card simply clocks down proportionally as you reduce the wattage.

I also monitored for thermal throttling.

It never showed up.

In an open-air case with three 140mm fans, the card stayed well below its thermal ceiling even at 600 W.

Speed was gated by the power budget, not by heat.

For a home workstation, the useful sweet spot sits between 475 W and 500 W.

At 500 W, you give up only 7% in wall time compared with the default 575 W.

At 475 W, the loss is about 11%.

If the machine idles 80% of the year, the yearly cash savings are small — roughly €26 to €34.

The bigger win is the extra thermal safety margin in a residential build.

For most of us, the “max performance” setting is basically a space heater with extra steps.

Does this linear scaling hold for larger models, or does the memory bottleneck change the math?

Method	What it means
nvidia-smi	NVIDIA’s command-line tool used to set GPU power limits and read telemetry.
wall time	The elapsed clock time from the start of a training run to its finish.
GPU utilization	The percentage of time that the GPU’s compute units are actively processing.
thermal throttling	Automatic clock reduction triggered when a chip exceeds its safe temperature limit.

This post was researched and drafted with AI assistance. I picked the ideas, reviewed every claim, and verified all benchmarks.