
There's another approach for doing this: generate a random number between 1 and 2 (they all have the same exponent) and then subtract 1 (that's the default implementation in Julia [0]). But I think it also gives you only 2^52 different numbers.

So, between 0.5 and 1 you miss out on every second representable number, between 0.25 and 0.5 you miss out on 3/4 of them, and so on.
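
For reference, here's the same trick written out in C (a rough sketch, not the Julia stdlib code; rng64() stands in for whatever 64-bit generator you use):

    #include <stdint.h>
    #include <string.h>

    /* Build a double in [1, 2) by pairing the exponent bits of 1.0 with
       52 random mantissa bits, then subtract 1. Every output is
       k * 2^-52, i.e. one of 2^52 equally spaced values. */
    double uniform01_via_1_2(uint64_t (*rng64)(void)) {
        uint64_t bits = 0x3FF0000000000000ULL   /* bit pattern of 1.0 */
                      | (rng64() >> 12);        /* 52 random mantissa bits */
        double x;
        memcpy(&x, &bits, sizeof x);            /* reinterpret the bits as a double */
        return x - 1.0;
    }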

I guess for many cases that's good enough, but the article seems like a nice improvement.

ETA: Lemire has some thoughts on this [1] and links to what might be a prior solution [2]. Vigna (of xoroshiro fame) writes about it at the bottom of [3] and also links to [2]. So, presumably the implementation described in the article is faster? ("There have been some past attempts to fix these flaws, but none that avoid a huge performance penalty while doing so.")

EDIT2: BTW, one of the things I love about HN (well, the world, really) is that there are people who care deeply about sampling floats between 0 and 1 uniformly and correctly, hitting all of them, and doing it faster.

[0] https://github.com/JuliaLang/julia/blob/master/stdlib/Random...

    rand(r::AbstractRNG, ::SamplerTrivial{CloseOpen01_64}) = rand(r, CloseOpen12()) - 1.0
[1] https://lemire.me/blog/2017/02/28/how-many-floating-point-nu...

[2] https://mumble.net/~campbell/2014/04/28/uniform-random-float

https://mumble.net/~campbell/2014/04/28/random_real.c

[3] https://prng.di.unimi.it



That is essentially just the fixed-point method mentioned here, and it is uniformly distributed across the range (and faster, because you do less work), but it does not fill in the low bits of the number correctly (having 53 bits of precision is not the same thing as there being only 2^53 floats between 0 and 1). If you look at the final distribution and draw fewer than 1-10 million numbers, that method and the division method will appear unbiased unless you hit a pathological case.
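
For comparison, the division method being referred to typically looks something like this (a sketch, with rng64() again a stand-in 64-bit source); every output is k * 2^-53, so anything below 0.5 has trailing zero bits in its mantissa:

    #include <stdint.h>

    double uniform01_division(uint64_t (*rng64)(void)) {
        /* 53 random bits scaled by 2^-53: spacing is 2^-53 everywhere */
        return (double)(rng64() >> 11) * 0x1p-53;
    }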


That looks like roughly the same end result as the fixed-point method, but possibly cheaper to execute.


Well, it has to be. The number can only have 53 bits of precision.

There are more than 2^53 floating point values between 0 and 1, but if you use them all, you'll fail to have a uniform distribution. Either certain regions of the space will be far more likely than other regions, or certain values will be far more likely than other values. The article takes the view that the second of those options is desirable, but it's not clear why.

And the algorithm the article describes is:

1. Generate a random float the uniform way.

2. If it's close to zero and has extra precision available, generate some more random bits to fill in the extra precision.
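
Roughly, and only as an illustration (this is not the article's code), step 2 can be thought of as continuing the binary expansion of the same uniform real whenever the leading digit hasn't shown up yet. next_u64() is a stand-in for any 64-bit generator, and this is the round-down variant:

    #include <stdint.h>
    #include <math.h>

    double uniform01_topped_up(uint64_t (*next_u64)(void)) {
        /* Step 1: the usual 53-bit sample, read as the first 53 binary
           digits of a uniform real in [0, 1). */
        uint64_t digits = next_u64() >> 11;
        int exp = -53;                       /* value so far: digits * 2^exp */

        /* Step 2: while the leading digit is still 0, the value sits in a
           smaller binade and the format has room for more precision, so
           shift in further random digits of that same real. */
        uint64_t spare = 0;
        int spare_left = 0;
        while ((digits >> 52) == 0 && exp > -1074) {
            if (spare_left == 0) { spare = next_u64(); spare_left = 64; }
            digits = (digits << 1) | (spare >> 63);
            spare <<= 1;
            spare_left--;
            exp--;
        }
        return ldexp((double)digits, exp);   /* exact: digits has at most 53 bits */
    }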

It's always going to be faster to skip step 2. What are we gaining by not skipping it?


Floats are typically used to approximate reals, so a uniform distribution over individual float values is usually not what is needed. What is wanted is a uniform distribution over equal-sized intervals.

2^53 equally spaced values within [0, 1) is plenty for many use cases, but it's still fundamentally a set of fixed-point rather than floating-point numbers. For a truly random floating-point value, all of the available precision should be used, so that every value in the domain carries some probability mass (except maybe subnormals, which you can usually ignore). Especially with 32-bit floats, it's not that difficult to run out of precision once you start doing math and only have a fixed-point subset of size 2^23 available.
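
As one concrete example of running out of precision (assuming the usual inverse-CDF trick for exponentials): with only the set {k * 2^-23} available in float32, the smallest nonzero sample is 2^-23, so -logf(u) can never exceed about 15.9, even though float32 itself reaches far smaller values.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        printf("largest possible -logf(u): %f\n", -logf(0x1p-23f));   /* ~15.94 */
        printf("smallest normal float32:   %g\n", 0x1p-126f);         /* ~1.2e-38 */
        return 0;
    }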


The floats are a surprisingly bad approximation of the reals. They're barely better than the integers, while their shortcomings are much harder to understand, so I'd say it's as if you wanted something approximating a spaceship and were choosing between an F-14 Tomcat and a Fiat Punto.

Neither of these things is a spaceship, but it will be obvious to almost anybody why the Punto isn't a spaceship, whereas you are less likely to know enough about the F-14 to see why it's not "good enough".


I think the truly surprising thing is just how well floating point numbers work in many practical applications despite how different they are from the real numbers. One could call it the "unreasonable effectiveness of floating point mathematics".


Why is that surprising? No one ever uses real numbers for anything.


There are many situations where you have something you want to compute to within a low number of units in the last place that seems fairly involved, but there is very often a clever method that lets you do it without having to go to extended or arbitrary precision. Maybe not that surprising, but it's something I find interesting.


> The floats are a surprisingly bad approximation of the reals. They're barely better than the integers

They are integers. Each float is a pattern of 64 discrete bits. Discreteness means integers.


In float32 land, if

    x   = 0.5   (with bit pattern 0x3f000000)
    y   = 0.25  (with bit pattern 0x3e800000)
then

    x+y = 0.75  (with bit pattern 0x3f400000)
But if we just added the bit patterns like integers, we would get 0x7d800000 (with a float value of over 10^37). Just because they are discrete doesn't mean they are integers, only that they can be mapped one-to-one with the integers. The bit pattern is not the semantic meaning of the value, and you cannot perform correct operations if you ignore the semantics.
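
A quick check of those numbers, using memcpy to move between a float and its bit pattern:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        float x = 0.5f, y = 0.25f, s = x + y;         /* 0.75 */
        uint32_t xb, yb, sb;
        memcpy(&xb, &x, 4);
        memcpy(&yb, &y, 4);
        memcpy(&sb, &s, 4);
        printf("x+y = %g (0x%08x)\n", s, sb);         /* 0.75 (0x3f400000) */

        uint32_t ib = xb + yb;                        /* integer-add the patterns */
        float wrong;
        memcpy(&wrong, &ib, 4);
        printf("0x%08x as a float: %g\n", ib, wrong); /* 0x7d800000 -> ~2.1e37 */
        return 0;
    }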


> It's always going to be faster to skip step 2. What are we gaining by not skipping it?

Having any number be possible is nice: having the entire space reachable when you're sampling a solution space where the solution may be smaller than 1/8, or having small vector quantities not have their angles ridiculously constrained by quantization.

Most applications don't need it-- indeed, a double has a lot of precision compared to what most people care about, and throwing away some of it at the low end of the randomness range is no big deal. But sometimes, a double is barely (or not even) enough.


> Having the entire space be reachable when you're sampling a solution space and the solution may be smaller than 1/8.

You're out of luck there; no matter what happens you can only generate floating point numbers. If you're worried that the solution lies within the space of floating point numbers, but not within the part that's reachable by generating uniform samples with 1/2^53 resolution, you need to be much, much, much more worried that the solution isn't a floating point number at all.

By contrast, if you're unconcerned with the numeric properties of the numbers you're generating, and what you want is to sample "floating point values" to examine what kind of implementation-specific behavior they might have, you would never want this scheme - you'd want to generate random floating-point values, which you can do by just filling in random bits. Those will be concentrated around zero, which doesn't matter because you're not interpreting them as numbers.


> You're out of luck there; no matter what happens you can only generate floating point numbers. If you're worried that the solution lies within the space of floating point numbers, but not within the part that's reachable by generating uniform samples with 1/2^53 resolution, you need to be much, much, much more worried that the solution isn't a floating point number at all.

2^53 is 15 sigfigs. My answer might be around 10 pico-something and I might need a few digits of precision. Then I've thrown away 11 of my sigfigs and there may not be a numerically stable solution with what's left.

As you point out, double precision often isn't enough. But it's much more likely to not be enough if we throw large parts of the precision away.


> certain values will be far more likely than other values

That's fine: under the normal understanding of uniformity on the interval, you want the probability that a value is chosen to be proportional to the "space" between its left and right neighbours, and that space is not constant across the interval.

What you try to emulate is picking a real number uniformly between 0 and 1, then returning the closest floating point number (except 1, maybe).
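
One way to make that concrete (a sketch; valid for values strictly inside the interval, since 0 and 1 only get half a neighbourhood):

    #include <math.h>

    /* Probability mass the "round a uniform real" model assigns to a
       representable value v in (0, 1): half the gap to each neighbour. */
    double mass(double v) {
        return (nextafter(v, 2.0) - nextafter(v, -1.0)) / 2.0;
    }
    /* e.g. mass(0.75) == 0x1p-53, while mass(0.375) == 0x1p-54 */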


> but if you use them all, you'll fail to have a uniform distribution

Only if you generate them all with equal probability.


Indeed, there's a simple algorithm using interval arithmetic to do so (within bounded time too!).

Think of it as binary subdivision/search of [0,1], using a stream of random bits. Steps:

(1) Divide the current interval in two at its midpoint; (2) using one random bit, pick one of the two halves; (3) if the picked interval (the new current interval) lies entirely within the domain of a single floating-point value [def1], stop and return that value, else go to (1).

[def1] The domain associated with a floating-point value is the interval from the midpoint between it and its lower neighbour on the left to the midpoint between it and its higher neighbour on the right.

I expect the performance to be very poor, but it does cover all floating-point numbers in [0,1] with exactly the right probabilities (assuming the bit probabilities are exactly correct). That's partly because naively you need higher-precision arithmetic to do it, and partly because it takes at least 54 iterations (on the order of 55 on average, and far more for values very close to zero), certainly well over 32.

(I've left out the proof that probabilities are correct, but I believe it's easy to show)
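
A literal-ish sketch of that loop (random_bit() is a stand-in for a fair bit source): after k bits the interval is [m, m+1)/2^k, and it lies inside a single float's domain exactly when both endpoints round to the same double, since rounding is monotone. Keeping m in a uint64_t limits k to 63, which covers everything except values below roughly 2^-11; a full version would switch to wider arithmetic in that (rare) branch.

    #include <stdint.h>
    #include <math.h>

    double subdivide01(int (*random_bit)(void)) {
        uint64_t m = 0;
        for (int k = 1; k <= 63; k++) {
            m = (m << 1) | (uint64_t)random_bit();   /* pick one half-interval */
            double lo = ldexp((double)m, -k);        /* correctly rounded m / 2^k     */
            double hi = ldexp((double)(m + 1), -k);  /* correctly rounded (m+1) / 2^k */
            if (lo == hi)        /* both endpoints round to the same float, so the */
                return lo;       /* whole interval sits in that float's domain     */
        }
        /* Rare fallback (value below ~2^-11 at this point): truncate instead
           of continuing the subdivision exactly. */
        return ldexp((double)m, -63);
    }

Note that it can occasionally return exactly 1.0, which matches rounding a uniform real drawn from [0,1].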


You can start with this idea, and then make it _very_ performant by using bit counting instructions. See https://www.corsix.org/content/higher-quality-random-floats for an exposition.
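
A minimal sketch of that idea (not the code from the linked page, and the round-down variant for brevity): the leading-zero count of a random 64-bit word gives the geometrically distributed binade, and fresh uniform bits fill the 52 explicit mantissa bits. rng64() is again a stand-in generator; __builtin_clzll is the GCC/Clang bit-counting intrinsic.

    #include <stdint.h>
    #include <math.h>

    double uniform01_all_floats(uint64_t (*rng64)(void)) {
        int e = 0;                           /* final value will lie in [2^(e-1), 2^e) */
        uint64_t w;
        while ((w = rng64()) == 0) {         /* a run of 64 zero bits: go one level deeper */
            e -= 64;
            if (e < -1024) return 0.0;       /* vanishingly unlikely; treat as 0 */
        }
        e -= __builtin_clzll(w);             /* each leading zero bit halves the binade */
        if (e - 1 < -1022) return 0.0;       /* avoid subnormals (probability ~2^-1022) */
        uint64_t mantissa = rng64() >> 12;   /* 52 fresh uniform bits for the significand */
        return ldexp(1.0 + (double)mantissa * 0x1p-52, e - 1);
    }

This only shows the bit-counting part; see the link for a proper treatment.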


Did you read the rest of my comment? You can't have a uniform distribution no matter what.


Ah, sorry I didn't see the "or certain values will be far more likely than other values". But I don't understand why that wouldn't fall under the definition of a uniform distribution.

The way I would define a uniform distribution is the following:

For any two floating-point numbers r1 and r2, which form the range [r1,r2] over the real numbers, and any second pair of floating-point numbers s1 and s2, which form a range [s1,s2] over the real numbers contained in [r1,r2], the probability of getting a result in [s1,s2] when sampling from [r1,r2] must be equal to the result of (s2-s1)/(r2-r1) computed with infinite precision.

This is obviously possible to achieve.


Given a uniform distribution over an interval (α, β) and a tolerance ε, sample the distribution twice. What is the probability that |x_1 - x_2| < ε?
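
(For the idealized continuous uniform on (α, β), writing L = β - α and assuming ε ≤ L, this works out to 1 - (1 - ε/L)^2, which is roughly 2ε/L for small ε.)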


At some point in my career I worked on floating point hardware design, and I agree with your approach.



