# The Exascale Computing Benchmark

# NetPIPE

Flops are irrelevant. Data matters, and data movement matters. Let's imagine
for a minute that we had a perfect quantum computing FPU that could execute
25 exaflops, took up 1 cubic millimeter, and used Brownian motion to power
it. We'd have all the Exascale computing issues solved, and the chemists,
physicists, and climate modelers would start battling it out for time on
the machine, right?
Not so fast... To get a human the result of those 25 exaflops, we have to interact
with and observe the results, and we do that in a pretty much classical Newtonian
mechanics macro-world in which, as far as we can tell, information transfer is
accomplished with discrete digital electrical or optical signals.

Well, if we take a step up out of CS theory, or maybe just go back to the Turing
machine, we observe that those electrical and optical signals take energy in the
transmitter (watts = current x voltage), and dissipate some of that energy as heat in
the transmission media and in the receiver. So really, the natural benchmark of our
exaflop quantum FPU is not flops, but bits/second. But if we are talking
about an exaflop in 1 cubic millimeter, we'd best go find some mechanical engineers
who know practical stuff about thermodynamics and heat transfer, because the surface
of this cubic millimeter exaflop is going to be lit up like a quasar getting the data
in and out. But I don't need to go all sci-fi on you to prove a point. The latest
processors from AMD and Intel modulate the CPU clock to stay within the thermal
envelope of the silicon package. Someone's going to bring up GPUs, but we have many
other practical problems getting data in and out of GPUs. And the fast ones run
really hot. I will bet you that the computing *system* that has the best
Bits/Joule is the one that will have the highest Bits/second (measured at the FPU),
and in turn, the highest machoFLOPS.

So let's just forget the machoFLOPS, and evaluate the system on bits per Joule for
efficiency, and bits per second for peak capability... Hrrm.. What benchmark should
we use? Might I offer a suggestion? Or write a few other ones. Or just report FLOPS
in terms of bits/joule. Or exaflop/megawatt-hour or something.
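
Here is a rough sketch of the bookkeeping, assuming you already have a transfer size, a wall-clock time, and an average power reading from some meter. The function and parameter names are hypothetical (they are not part of NetPIPE or any other tool), and the numbers are made up for illustration:

```python
def bits_per_second(bytes_moved: int, seconds: float) -> float:
    """Peak-capability metric: bits delivered per second."""
    return bytes_moved * 8 / seconds


def bits_per_joule(bytes_moved: int, seconds: float, avg_watts: float) -> float:
    """Efficiency metric: bits delivered per joule consumed.

    Energy in joules = average power in watts * elapsed time in seconds.
    """
    joules = avg_watts * seconds
    return bytes_moved * 8 / joules


# Made-up illustrative run: 1 GB moved in 2 s while the system draws 250 W.
moved, elapsed, power = 10**9, 2.0, 250.0
print(f"{bits_per_second(moved, elapsed):.3e} bits/s")
print(f"{bits_per_joule(moved, elapsed, power):.3e} bits/J")
```

Note that bits/Joule is just bits/second divided by watts, so at a fixed throughput, halving the power draw doubles the efficiency score.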

## The Bitcoin corollary

SHA-256 hashing performance (Gigahash/sec, Gigahash/watt, and Hash/Joule) is limited
not by process technology, but by the IR drop of the circuit board, pad mounts, solder,
interposer, wire bonds, and silicon metal layers between the power converter and the
active transistors. Once you get to around 40-28 nm process tech, around half the power
gets dissipated in the electron traffic jam on the way to the transistors. (This is
based on a completely non-scientific gut reaction to the way a bunch of bitcoin miner
ASIC/system vendors take some estimate of theoretical performance from their chip
design and then end up halving it because they forgot to calculate 15 levels of IR
losses.) More scientific results to come when efabless
makes me an open-source hardware bitcoin ASIC chip.
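
To see how "around half" can happen, here is a minimal sketch that lumps every layer between the power converter and the transistors into one series resistance. The function name and the numbers are assumptions picked for illustration, not measurements from any real miner:

```python
def delivery_loss_fraction(supply_volts: float, amps: float, path_ohms: float) -> float:
    """Fraction of the converter's output power burned in the delivery path.

    Power lost in the path is I^2 * R, total power supplied is V * I,
    so the lost fraction reduces to (I * R) / V.
    """
    if supply_volts <= 0 or amps < 0 or path_ohms < 0:
        raise ValueError("expected positive voltage and non-negative current/resistance")
    ir_drop = amps * path_ohms  # volts lost before reaching the transistors
    if ir_drop > supply_volts:
        raise ValueError("IR drop exceeds the supply voltage")
    return ir_drop / supply_volts


# Made-up numbers: 0.6 V rail, 100 A draw, 3 milliohms of total path resistance.
print(f"{delivery_loss_fraction(0.6, 100.0, 0.003):.0%} of the power never reaches the transistors")
```

The point of the toy model is that at low core voltages and high currents, a few milliohms spread across the board, package, and metal layers is enough to eat a large share of the budget.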

## Some interesting links on power densities

- 600 kW/m^3 ITER fusion reactor
- 10 kW/m^3 (200 kW / 20 m^3) Toshiba Micro Nuclear
- approx 100 MW/m^3 in a nuclear reactor (784 MWe * 3 = 2352 MW heat / (2.5 * 2.5 * 3.7))
- approx 100 MW/m^3 in a 1 cm x 1 cm x 1 mm piece of active silicon dissipating 10 watts
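
The arithmetic behind the last two entries can be checked with a few lines. This is only a sketch: the helper name is hypothetical, the reactor dimensions are assumed to be in meters, and the factor of 3 from electrical to thermal output is the rough figure used in the list above:

```python
def power_density(watts: float, volume_m3: float) -> float:
    """Power density in W/m^3."""
    return watts / volume_m3


# Reactor: 784 MW electric, assumed ~3x that as heat, in a 2.5 x 2.5 x 3.7 volume
# (dimensions assumed to be meters).
reactor = power_density(784e6 * 3, 2.5 * 2.5 * 3.7)

# Silicon: 10 W in a 1 cm x 1 cm x 1 mm slab, converted to meters.
silicon = power_density(10.0, 0.01 * 0.01 * 0.001)

print(f"reactor: {reactor / 1e6:.1f} MW/m^3")
print(f"silicon: {silicon / 1e6:.1f} MW/m^3")
```

Both land at roughly 100 MW/m^3, which is the comparison the list is making: a modest chip dissipating 10 watts is in the same power-density class as a reactor core.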

### Other things written by Troy