
My take on the difference between now and then is “effort”. All those things mentioned above are now effortless, but the door to “effort” remains open as it always has been. Take the first point, for example. Those little black boxes of AI can be significantly demystified by, for example, watching a bunch of videos (https://karpathy.ai/zero-to-hero.html) and spending at least 40 hours of hard cognitive effort learning about it yourself. We used to purchase software or write it ourselves before it became effortless to get it for free in exchange for ads, and then a subscription once we grew tired of ads or were tricked by a bait and switch. You can also argue that it has never been easier to write your own software than it is today.

Hostile operating systems. Take the effort to switch to Linux.

Undocumented hardware? Well, there is far more open source hardware out there today, and back in the day it was fun to reverse engineer hardware. Now we just expect it to be open because we can’t be bothered to put in the effort anymore.

Effort gives me agency. I really like learning new things and so agentic LLMs don’t make me feel hopeless.




I’ve worked in the AI space and I understand how LLMs work in principle. But we don’t know the magic contained within a model after it’s been trained. We understand how to design a model, and how models work at a theoretical level, but we cannot know how well one will perform at inference until we test it. So much of AI research is just trial and error, with different dials repeatedly tweaked until we get something desirable. So no, we don’t understand these models in the same way we might understand how a hashing algorithm works. Or a compression routine. Or an encryption cipher. Or any other hand-programmed algorithm.
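
To make the “dials” point concrete, here is a minimal sketch of what a lot of applied ML work looks like in practice: a random search over hyperparameters with an empirical evaluation in the loop. The train_and_evaluate function is a hypothetical placeholder; the point is that the loop is driven by measured results, not by a derivation.

  import random

  # Hypothetical search space: the "dials" that get tweaked between runs.
  SEARCH_SPACE = {
      "learning_rate": [1e-4, 3e-4, 1e-3],
      "batch_size": [16, 32, 64],
      "num_layers": [4, 8, 12],
  }

  def train_and_evaluate(config):
      # Placeholder for a real training run; in practice this is hours of
      # GPU time, and the returned validation score is the only signal back.
      return random.random()

  best_score, best_config = float("-inf"), None
  for _ in range(20):
      config = {name: random.choice(opts) for name, opts in SEARCH_SPACE.items()}
      score = train_and_evaluate(config)
      if score > best_score:
          best_score, best_config = score, config

  print("best config:", best_config, "score:", round(best_score, 3))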

I also run Linux. But that doesn’t change how the two major platforms behave and that, as software developers, we have to support those platforms.

Open source hardware is great, but it’s not in the same league as proprietary hardware in terms of price and performance.

Agentic AI doesn’t make me feel hopeless either. I’m just describing what I’d personally define as a “golden age of computing”.


but isn't this like a lot of other CS-related "gradient descent"?

when someone invents a new scheduling algorithm or a new concurrent data structure, it's usually based on hunches and empirical results (benchmarks) too. nobody sits down and mathematically proves their new linux scheduler is optimal before shipping it. they test it against representative workloads and see if there is uplift.
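
as a toy illustration of "test it against representative workloads" (my own sketch, not anything from a real kernel): simulate two non-preemptive scheduling policies on a synthetic batch of jobs and compare average waiting time, rather than proving anything about either.

  import random

  def avg_wait(bursts):
      # Mean waiting time when jobs run non-preemptively in the given order.
      waited, elapsed = 0, 0
      for burst in bursts:
          waited += elapsed
          elapsed += burst
      return waited / len(bursts)

  random.seed(0)
  workload = [random.randint(1, 50) for _ in range(1000)]  # synthetic burst times

  print("FCFS avg wait:", avg_wait(workload))          # first-come, first-served
  print("SJF  avg wait:", avg_wait(sorted(workload)))  # shortest job first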

we understand transformer architectures at the same theoretical level we understand most complex systems. we know the principles, we have solid intuitions about why certain things work, but the emergent behavior of any sufficiently complex system isn't fully predictable from first principles.

that's true of operating systems, distributed databases, and most software above a certain complexity threshold.


No. Algorithm analysis is much more sophisticated and well defined than that. Most algorithms are deterministic, and it is relatively straightforward to identify their complexity, O(). Even for nondeterministic algorithms, we can evaluate asymptotic performance under different categories of input. We know a lot about how an algorithm will perform under a wide variety of input distributions, regardless of determinism. In the case of schedulers and other critical concurrency algorithms, performance is well known before release. There is a whole subfield of computer science dedicated to it. You don't have to "prove optimality" to know a lot about how an algorithm will perform. What's missing in neural networks is the why and how of inputs propagating through the network during inference. It is a black box in terms of understandability: under a great deal of study, but still very poorly understood.
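
For example (a toy sketch of my own, not a formal analysis): for a deterministic algorithm like binary search you can state up front that it examines at most floor(log2 n) + 1 positions, and then watch any particular run respect that bound.

  import math

  def binary_search(xs, target):
      # Returns (found, probes) so the amount of work done is explicit.
      lo, hi, probes = 0, len(xs) - 1, 0
      while lo <= hi:
          mid = (lo + hi) // 2
          probes += 1
          if xs[mid] == target:
              return True, probes
          if xs[mid] < target:
              lo = mid + 1
          else:
              hi = mid - 1
      return False, probes

  xs = list(range(1_000_000))
  found, probes = binary_search(xs, 999_999)
  bound = math.floor(math.log2(len(xs))) + 1
  print(found, probes, "<=", bound)  # the run never exceeds the stated bound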

i agree w/ the complexity analysis point, but how well that theoretical understanding actually translates to real world deployment decisions is questionable in both subfields. knowing an algorithm is O() tells you surprisingly little about whether it'll actually outperform alternatives on real hardware with real cache hierarchies, branch predictors, and memory access patterns. same thing with ML (just with the very different nature of GPU hw). both subfields have massive graveyards of "improvements" that looked great on paper (or in controlled environments) but never made it into production systems. arxiv is full of architecture tweaks showing SOTA on some benchmark, and the same w/ novel data structures/algorithms that nobody ever uses at scale.
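
a mundane concrete example of this, in python rather than anything gpu-shaped: list membership is O(n) and set membership is O(1), but at small sizes the winner comes down to constants, allocation, and cache behaviour on your particular machine, which is why you time it instead of arguing from asymptotics alone.

  import timeit

  for n in (4, 16, 64, 256, 1024):
      lst = list(range(n))
      st = set(lst)
      probe = n - 1  # worst case for the linear scan
      t_list = timeit.timeit(lambda: probe in lst, number=100_000)
      t_set = timeit.timeit(lambda: probe in st, number=100_000)
      print(f"n={n:5d}  list: {t_list:.4f}s  set: {t_set:.4f}s")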

I think you missed the point. Proving something is optimal is a much higher bar than just knowing how the hell the algorithm gets from inputs to outputs in a reasonable way. Even concurrent systems and algorithm bounds under input distributions have well-established ways to evaluate them. There is literally no theoretical framework for how a neural network churns out answers from inputs, other than the most fundamental "matrix algebra". Big O, Theta, Omega, and asymptotic analysis are all sound theoretical methods for evaluating algorithms. We don't have anything even that good for neural networks.
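
For reference, the standard textbook definitions behind that notation (nothing specific to this thread):

  \begin{aligned}
  f(n) = O(g(n))      &\iff \exists\, c > 0,\ n_0 : 0 \le f(n) \le c\,g(n) \ \text{for all } n \ge n_0 \\
  f(n) = \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 : f(n) \ge c\,g(n) \ge 0 \ \text{for all } n \ge n_0 \\
  f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \ \text{and}\ f(n) = \Omega(g(n))
  \end{aligned}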

>Those little black boxes of AI can be significantly demystified by, for example, watching a bunch of videos (https://karpathy.ai/zero-to-hero.html) and spending at least 40 hours of hard cognitive effort learning about it yourself.

That's like saying you can understand humans by watching some physics or biology videos.


No it’s not

Nobody has built a human so we don’t know how they work

We know exactly how LLM technology works


We know _how_ it works but even Anthropic routinely does research on its own models and gets surprised

> We were often surprised by what we saw in the model

https://www.anthropic.com/research/tracing-thoughts-language...


Which is…true of all technologies since forever

Except it's not. Traditional algorithms are well understood because they're deterministic formulas. We know what the output is if we know the input. The surprises that happen with traditional algorithms are when they're applied in non-traditional scenarios as an experiment.

Whereas with LLMs, we get surprised even when using them in an expected way. This is why so much research happens investigating how these models work even after they've been released to the public. And it's also why prompt engineering can feel like black magic.
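
To ground the "deterministic formula" contrast, here is the kind of behaviour I mean, using ordinary stdlib hashing (my example, not anything from the thread): the same input always yields the same output, and that is the entire contract.

  import hashlib

  msg = b"the same input, every time"
  digest_1 = hashlib.sha256(msg).hexdigest()
  digest_2 = hashlib.sha256(msg).hexdigest()

  # A hand-programmed algorithm: identical inputs give identical outputs,
  # on any machine, in any year. Nothing left to be surprised by after release.
  assert digest_1 == digest_2
  print(digest_1)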


I don’t know what to tell you other than to say that the concept of determinism in engineering is extremely new

Everything you said right now holds equally true for chemical engineering and biomedical engineering, so, like, you need to get some experience


I think the historical record pushes back pretty strongly on the idea that determinism in engineering is new. Early computing basically depended on it. Take the Apollo guidance software in the 60s. Those engineers absolutely could not afford "surprising" runtime behavior. They designed systems where the same inputs reliably produced the same outputs because human lives depended on it.

That doesn't mean complex systems never behaved unexpectedly, but the engineering goal was explicit determinism wherever possible: predictable execution, bounded failure modes, reproducible debugging. That tradition carried through operating systems, compilers, finance software, avionics, etc.

What is newer is our comfort with probabilistic or emergent systems, especially in AI/ML. LLMs are deterministic mathematically, but in practice they behave probabilistically from a user perspective, which makes them feel different from classical algorithms.
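
A small sketch of what I mean by "deterministic mathematically, probabilistic from the user's perspective", with toy numbers rather than a real model: the softmax is a fixed formula, but the sampling step on top of it is where the run-to-run variation comes from.

  import math
  import random

  def softmax(logits, temperature=1.0):
      # Deterministic: the same logits and temperature always produce the
      # same probability distribution.
      scaled = [l / temperature for l in logits]
      peak = max(scaled)
      exps = [math.exp(s - peak) for s in scaled]
      total = sum(exps)
      return [e / total for e in exps]

  # Toy next-token logits a model might assign for some prompt.
  vocab = ["cat", "dog", "teapot"]
  probs = softmax([2.1, 1.9, 0.3], temperature=0.8)

  # Probabilistic: sampling from that fixed distribution is where the
  # run-to-run variation a user sees comes from.
  for _ in range(5):
      print(random.choices(vocab, weights=probs, k=1)[0])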

So I'd frame it less as "determinism is new" and more as "we're now building more systems where strict determinism isn't always the primary goal."

Going back to the original point, getting educated on LLMs will help you demystify some of the non-determinism but as I mentioned in a previous comment, even the people who literally built the LLMs get surprised by the behavior of their own software.


I refuse to believe you sincerely think this is a salient point. Determinism was one of the fundamental axioms of software engineering.

That’s some epic goal post shifting going on there!!

We’re talking about software algorithms. Chemical and biomedical engineering are entirely different fields. As are psychology, gardening, and morris dancing


I said all technologies

Yeah. Which any normal person would take to mean “all technologies in software engineering” because talking about any other unrelated field would just be silly.

We know why they work, but not how. SotA models are an empirical goldmine; we are learning a lot about how information and intelligence organize themselves under various constraints. This is why new papers are published every single day that further explore the capabilities and inner workings of these models.

You can look at the weights and traces all you like with telemetry and tracing

If you don’t own the model then you have a problem that has nothing to do with technology


Ok, but the art and science of understanding what we're even looking at is actively being developed. What I said stands, we are still learning the how. Things like circuits, dependencies, grokking, etc.


