
As soon as I read the title, I chuckled, because coming from a computational mathematics background I already knew roughly what it was going to be about. IEEE 754 is like democracy in the sense that it is the worst, except for all the others. Immediately when I saw the example I thought: it is going to be either Kahan summation or a full-scale computer algebra system. It turned out to be some subset of the latter, and I have to admit I had never heard of Recursive Real Arithmetic (I knew of Real Analysis, though).

If anything, it was a great insight into one of my early C++ heroes and what they did in their professional life outside of the things they are known for. But most importantly, it was a reminder of how deep seemingly simple things can be.



IEEE 754 is what you get when you want numbers to have huge dynamic range, equal precision across the range, and fixed bit width. It balances speed and accuracy, and produces a result that is very close to the expected result 99.9999999% of the time. A competent numerical analyst can take something you want to do on paper and build a sequence of operations in floating point that compute that result almost exactly.
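
A toy illustration of that last point (a sketch in Python, nothing beyond the standard library): the textbook way of computing e^x - 1 for tiny x throws away most of its digits to cancellation, while the rearranged library routine gets essentially every bit right.

    import math

    x = 1e-12
    naive  = math.exp(x) - 1.0   # exp(x) is ~1, so the subtraction magnifies its rounding error
    stable = math.expm1(x)       # same mathematical value, computed without the cancellation

    print(naive)    # about 1.00009e-12: only 4-5 correct digits survive
    print(stable)   # 1.0000000000005e-12: correct to full double precision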

I don't think anyone who worked on IEEE 754 (and certainly nobody who currently works on it) contemplated calculators as an application, because a calculator is solving a fundamentally different problem. In a calculator, you can spend 10-100 ms doing one operation and people won't mind. In the applications for which IEEE 754 is made, you are expecting to do billions or trillions of operations per second.


William Kahan worked on both IEEE 754 and HP calculators. The speed gap between something like an 8087 and a calculator was not that big back then, either.


Billions or trillions of ops per second and 1987 don't really go together.



Good point! Side note: Cray-2 did not use IEEE 754 floating point.

https://cray-history.net/2021/08/26/cray-floating-point-numb...


Cray did use floating point. It didn't use IEEE standard floating point. Floating point arithmetic is older than the transistor.


Yeah I know. I linked the specs.


Yeah I mean they were surely too old to support it. But the designers of IEEE-754 must have been aware of these systems when they were making the standard.


> equal precision across the range

What? Pretty sure there's more precision in [0-1] than there is in really big numbers.


Precision in numerics is usually considered in relative terms (eg significant figures). Every floating point number has an equal number of bits of precision. It is true, though, that half of the floats are between -1 and 1. That is because precision is equal across the range.
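
One way to see this concretely (a small sketch in Python, assuming only the standard library's math.ulp, available since 3.9): the absolute gap between adjacent doubles grows with magnitude, but the gap relative to the value stays pinned near 2^-52 across the whole normal range.

    import math

    # ulp(x) is the gap between x and the next representable double.
    for x in (1.0, 1000.0, 1e300):
        print(x, math.ulp(x), math.ulp(x) / x)
    # The absolute gap spans ~300 orders of magnitude; the relative gap stays ~1e-16 throughout.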


Only the normal floating point numbers have this property; the subnormals do not.

In the single precision floats for example there is no 0.000000000000000000000000000000000000000000002 it goes straight from 0.000000000000000000000000000000000000000000001 to 0.000000000000000000000000000000000000000000003

So that's not even one whole digit of precision.
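
A small sketch of how coarse the grid is down there (Python, just reinterpreting raw bit patterns with struct):

    import struct

    def f32_from_bits(bits):
        # reinterpret a 32-bit integer pattern as an IEEE 754 single-precision value
        return struct.unpack('<f', struct.pack('<I', bits))[0]

    # The three smallest positive single-precision values are consecutive bit patterns:
    print(f32_from_bits(1))   # ~1.4e-45
    print(f32_from_bits(2))   # ~2.8e-45
    print(f32_from_bits(3))   # ~4.2e-45
    # Nothing is representable in between, so a value like 2e-45 has nowhere to land.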


Yes, that is true. The subnormal numbers gradually lose precision going towards zero.


Subnormals are a dirty hack to squeeze a bit more breathing space around zero for people who really need it. They aren't even really supported in hardware. Using them in normal contexts is usually an error.


As of 2025, they finally have hardware support from Intel and AMD. IIRC it took until Zen 2 and Ice Lake to do this.


Oh joy! Just in time for all computation to move to GPUs running eight-bit "floats".


IEEE 754 is what you get if you started with the idea of sign, exponent, and fraction and made the most efficient hardware implementation of it possible. It's not "beautiful", but it falls out pretty straightforwardly from those starting assumptions, even the seemingly weirder parts like -0, subnormals and all the rounding modes. It was not really democratically designed, but done by numerical computing experts coupled with hardware design experts. Every "simplified" implementation of floating point that has appeared (e.g. auto-FTZ mode in vector units) has eventually been dragged kicking and screaming back to the IEEE standard.
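
To make the sign/exponent/fraction framing concrete, here is a small sketch (Python, standard library only) that pulls the three fields out of a double's bit pattern:

    import struct

    def fields(x):
        # split a double into its sign bit, 11-bit biased exponent, and 52-bit fraction
        bits = struct.unpack('<Q', struct.pack('<d', x))[0]
        sign     = bits >> 63
        exponent = (bits >> 52) & 0x7FF        # biased by 1023; 0 means zero/subnormal
        fraction = bits & ((1 << 52) - 1)
        return sign, exponent, fraction

    print(fields(1.0))    # (0, 1023, 0)  ->  +1.0 * 2^0
    print(fields(-2.0))   # (1, 1024, 0)  ->  -1.0 * 2^1
    print(fields(0.1))    # non-zero fraction: 0.1 is not exactly representable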


Another way to see it is that floating point is the logical extension of fixed point math into log space, to deal with numbers spanning many orders of magnitude. I don't know if "beautiful" is exactly the right word, but it's an incredibly solid bit of engineering.


I feel like your description comes across as more negative on the design of IEEE-754 floats than you intend. Is there something else you think would have been better? Maybe I’m misreading it.

Maybe the hardware focus can be blamed for the large exponents and small mantissas.

The only reasonable non-IEEE things that come to mind for me are:

- bfloat16, which just works with the most significant half of a float32 (sketched below).

- log8 which is almost all exponent.

I guess in both cases they are about getting more out of available memory bandwidth, and the main operation is f32 + x * y -> f32 (i.e. multiply and accumulate into an f32 result).

Maybe they will be (or already are) incorporated into IEEE standards, though.
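
A minimal sketch of the bfloat16 point above (Python with struct; this truncates, whereas real conversions usually round to nearest even):

    import struct

    def to_bfloat16_bits(x):
        # bfloat16 is literally the top 16 bits of the float32 encoding
        f32_bits = struct.unpack('<I', struct.pack('<f', x))[0]
        return f32_bits >> 16

    def from_bfloat16_bits(b16):
        # widen back to float32 by zero-filling the low 16 bits
        return struct.unpack('<f', struct.pack('<I', b16 << 16))[0]

    x = 3.14159
    print(from_bfloat16_bits(to_bfloat16_bits(x)))
    # ~3.140625: the full float32 exponent range survives, but only ~3 decimal digits of mantissa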


Well, I do know some people who really hate subnormals because they are really slow on Intel and kinda slow on Arm. Subnormals I can see being a pain for graphics HW designers. I for one neither love nor hate IEEE 754, other than -0. I have spent far, far too many hours dealing with it. IMHO, it's an encoding artifact masquerading as a feature.


> what you get if you started with the idea of sign, exponent, and fraction and made the most efficient hardware implementation of it possible. It's not "beautiful", but it falls out pretty straightforwardly from those starting assumptions

This implies a strange way of defining what "beautiful" means in this context.


IEEE754 is not great for pure maths, however, it is fine for real life.

In real life, no instrument is going to give you a measurement with the 53 bits of precision a double can offer, and you are probably never going to encounter quantities in the 10^1000 range. No actuator is precise enough either. Even single precision is usually more than physical devices can work with. When drawing a pixel on screen, you don't need to know its position down to the subatomic level.

For these real-life situations, if you wanted to improve on the usual IEEE 754 arithmetic, you would probably be better served by interval arithmetic (sketched below). It would fail at maths, but in exchange you get support for measurement errors.

Of course, in a calculator, precision is important because you don't know if the user is working with real life quantities or is doing abstract maths.
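
To sketch what interval arithmetic looks like (a bare-bones toy in Python; a real library would also force the rounding of each endpoint outward):

    # Each quantity is carried as a (lo, hi) pair guaranteed to contain the true value.

    def add(a, b):
        return (a[0] + b[0], a[1] + b[1])

    def mul(a, b):
        corners = (a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1])
        return (min(corners), max(corners))

    length = (9.95, 10.05)     # "10" measured to +/- 0.05
    width  = (4.95, 5.05)      # "5" measured to +/- 0.05
    print(mul(length, width))  # (49.2525, 50.7525): the area, with its honest uncertainty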


> IEEE754 is not great for pure maths, however, it is fine for real life.

Partially. It can be fine for pretty much any real-life use case. But many naive implementations of formulae involve some gnarly intermediates despite having fairly mundane inputs and outputs.
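
A classic instance (my sketch in Python, not from the parent): the two-dimensional distance formula has perfectly ordinary inputs and output, but the textbook intermediate blows past the double range.

    import math

    x, y = 1e200, 1e200
    naive  = math.sqrt(x*x + y*y)   # inf: x*x is 1e400, which overflows a double
    stable = math.hypot(x, y)       # ~1.414e200: rescales internally to dodge the overflow
    print(naive, stable)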


It becomes a problem when precision errors accumulate in a system, right?

The issue isn't so much that a single calculation is slightly off, it's that many calculations together will be off by a lot at the end.

Is this stupid or..?
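
For concreteness, a toy sketch (Python) of rounding errors piling up over a long sum, and of the Kahan compensated summation mentioned upthread, which carries the lost low-order bits forward:

    from fractions import Fraction

    def kahan_sum(values):
        total, compensation = 0.0, 0.0
        for v in values:
            y = v - compensation
            t = total + y
            compensation = (t - total) - y   # the low-order bits just lost in the addition
            total = t
        return total

    data = [0.1] * 10_000_000
    exact = float(Fraction(0.1) * 10_000_000)   # the correctly rounded true sum

    print(sum(data) - exact)         # noticeably non-zero: 10^7 tiny roundings have piled up
    print(kahan_sum(data) - exact)   # essentially zero (within an ulp or two)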


> IEEE 754 is like democracy in a sense that it is the worst, except for all the others.

I can't see what would be worse. The entire raison d'etre for computers is to give accurate results. Introducing a math system which is inherently inaccurate to computers cuts against the whole reason they exist! Literally any other math solution seems like it would be better, so long as it produces accurate results.


Sometimes you need a number system which is 1. approximate 2. compact and fast 3. high dynamic range

You’re going to have a hard time doing better than floats with those constraints.


> so long as it produces accurate results

That phrase is doing a lot of work. IEEE 754 does very well in terms of error vs. representation size.

What system has accurate results? I don't know of any number system in actual use that 1) represents numbers with a fixed size, 2) can represent 1/n accurately for reasonable integers, and 3) can do exponents accurately.
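
For what it's worth, exact rationals show the tension nicely (a sketch using Python's fractions module): requirement 2) is satisfied perfectly, but requirement 1) goes out the window, because the representation grows without bound.

    from fractions import Fraction

    x = Fraction(1, 3)
    for _ in range(12):
        x = x * x + Fraction(1, 7)    # a dozen steps of an innocuous-looking recurrence
    print(len(str(x.denominator)))    # already thousands of digits for a single number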


Electronic computers were created to be faster and cheaper than a pool of human computers (who may have had slide rules or mechanical adding machines). Human computers were basically doing decimal floating point with limited precision.


There's no "accurate results" most of the time

You can only have a result that's exact enough in your desired precision


It's ideal for engineering calculations, which are a common use of computers. There, nobody cares if 1-1=0 exactly or not, because you could never have measured those values exactly in the first place. Single precision is good enough for just about any real-world measurement or result, while double precision is good for intermediate results without losing accuracy that's visible in the single-precision input/output, as long as you're not using a numerically unstable algorithm.


Define "accurate"!


Given that a computer has finite memory but there are infinitely many real numbers in any range, any system working with real numbers will have to use rounding.



