For every real valued function and every epsilon greater than zero, there’s a neural network (size unbounded) which approximates the function with precision epsilon.
It sounds impressive and, as I understand it, is the basis for the argument that algorithms based on NN’s such as LLM’s will be able to put perform humans at tasks such as programming.
But this theorem contains an ambiguous term that makes it less impressive when you remove it.
Which for me, makes such tools… interesting I guess for some applications but it’s not nearly as impressive as to remove the need for programmers or to replace their labour entirely with automation that we need to concern ourselves with writing markdown files and wasting tokens asking the algorithm to try again.
So this whole argument that, “you better learn to use them or be displaced in the labour market,” is a relying on a weak argument.
I think the distinction without a difference is a tool being deterministic or not. Fundamentally, its nature doesn't matter, if in actual practice it outperforms everything else.
Be that as it may, moving the goalpost aside. For me personally this fundamentally does matter. Programming is about giving instructions for a machine (or something mechanical) to follow. It matters a great deal to me that the machine reliably follows the instructions I give it. And compiler authors of the past have gone to great lengths to make their compilers produce robust (meaning deterministic) output, as have language authors tried to make their standards as rigorous (meaning minimize undefined behavior) as possible.
And for that matter, going back to the band saw analogy, a measure of a quality of a great band saw is, in fact, that the blade won’t snap in half in the middle of a cut. If a band saw manufacturer produces a band saw with a really low binomial p-value (meaning it is less deterministic/more stochastic) that is a pretty lousy band saw, and good carpenters will know to stay away from that brand of band saws.
To me this paints a picture of a distinction that does indeed have a difference. A pretty important difference for that matter.
I think this is a distinction without a difference, we all know what we mean when way say deterministic here.