Hacker News | zer00eyz's comments

Too bad that author and committer are individuals and not lists. It would be good to see who wrote them and how the voting went as well.

> This needs to be studied against people in real life who have a social contract of some sort... IME, LLMs will shoot holes in your ideas and it will efficiently do so.

The Krafton / Subnautica 2 lawsuit paints a very different picture, because "ignored legal advice" and "followed the LLM" was a choice. Do you think someone whose conversations treat "conviction" and "feelings" as the arbiters of choice is going to buy into the LLM's pushback, or push it to give a contrived outcome?

The LLM lacks will; it's more or less a debate team member and can be pushed into arguing any stance you want it to take.


> I think this may be related to how people read code. You have people who scan shapes, and then you have people who read code almost like prose.

I think this is an astute observation.

I think there is another axis of "reading" that happens: are you reading for "interaction" or for "isolation"?

Sure, c.method is a scannable shape, but if your system deals with Cats, Camels, Cars, and Crabs, that same c.method, when it's an abstract API call divorced from the underlying representation, might not be as helpful.
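A hypothetical sketch of the point above (the class and method names are made up for illustration): four unrelated domain models whose instances all end up in a variable named `c`, so the call site has one shape while the meaning depends entirely on which model is behind it.

```python
# Four domain models that happen to share a method name. At the call
# site, c.move() reads identically regardless of which one `c` is.

class Cat:
    def move(self):
        return "pounces"

class Camel:
    def move(self):
        return "plods"

class Car:
    def move(self):
        return "drives"

class Crab:
    def move(self):
        return "scuttles"

def advance(c):
    # One "shape" for a reader scanning the code, but the underlying
    # behaviour is invisible without knowing the concrete type of `c`.
    return c.move()
```

Scanning `advance` tells you nothing about which representation is in play; you have to read the surrounding context.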

I would think that we would have more and better research on this, but the only paper I could find was this: https://arxiv.org/pdf/2110.00785 It's a meta-analysis of 57 other papers: a decent primer, but nothing groundbreaking.

> I scan shapes. ... verbal description.

I would be curious whether you frequently use a debugger, because I tend to find the latter (descriptive) style much more useful in that context.


> dealing with an abstract api call divorced from the underlying representation

I don't understand what you mean. Could you give me an example?

> I would be curious if you frequently use a debugger?

I practically never use a debugger.


The shape argument works well in small packages, but it starts to fail once you have multiple domain models starting with the same letter.

I wasn’t talking about just symbols but entire paragraphs of code as well.

You aren't wrong, but it is not an absolute.

Furniture maker, house framer, and finish carpenter all fall under the category of woodworking, but these jobs are not the same. Years of honed skill in tool use make working in the other categories possible, but quality and productivity will suffer.

Does working in JS on the front end teach you how to code? It sure does. So does working on an embedded system. But these jobs might be further apart than any of the ones I highlighted in the previous category.

There are plenty of combinations of systems and languages where your rule about a screen just isn't going to apply. There are plenty of problems that create scenarios where "ugly loops" are a reality.


I didn't say it was an absolute. But once a scope grows to the point where you have to navigate around to absorb a function or a loop, readability worsens and complexity grows, as does your mental processing time. Especially for people who "scan" code rapidly rather than reading it.

The slower "readers" will probably not mind as much.

This is why limits on function size are usually part of coding standards at a company or on a project (look at Google, Linux, etc.).


On top of that, there are probably a few more hits for the containers, VMs, and hypervisor, and all those pods have monitoring, etc. All the layers of abstraction are just stacks of turtles, giving the illusion of being easier while adding complexity and cost/overhead.

It is a security product, so unless they want to deal with the exfiltration charges on the data, it's probably better to keep it in AWS. That's the nasty double-edged sword of "cloud", and how we're all getting locked in.

All the bits on their own seem to make perfect sense, but it's become apparent that the orchestra has been blindfolded and given noise-cancelling headphones.


And to market their AI security product.

The above comment needs to be higher.

If we had a black-box programming language and handed it over to this system, it would never be able to do anything with it past its context window.

Hey kids, I hear you like agents, so we made an agent write agents till we got better agents.


> something competitive with Nvidia for AI training

Apple is counting on something else: model shrink. Everyone is now looking at "how do we make these smaller".

At some point a beefy Mac Studio and the "right sized" model is going to be what people want. Apple dumped a 4-pack of them in the hands of a lot of tech influencers a few months back, and they were fairly interesting (expensive, though).


> Apple is counting on something else: model shrink

The most powerful AI interactions I've had involved giving a model a task and then fucking off. At that point, I don't actually care if it takes 5 minutes or an hour. I've queued up a list of background tasks it can work on, and that I can circle back to when I have time. In that context, smaller isn't even the virtue at hand; user patience is. Having a machine that works on my bullshit questions and modelling projects at one tenth the speed of a datacentre could still work out to being a good deal, even before considering the privacy and lock-in problems.


What "tooling" do you use to let AIs work unattended for long periods?

> What "tooling" do you use to let AIs work unattended for long periods?

Claude and Kagi Assistant. I tried tooling up a multi-model environment in Ollama and it was annoying. It's just searching the web, building models and then running a test suite against the model to refine it.


Cool? And it has nothing to do with what kind of consumer hardware Apple should sell. If your use case is literally "bigger model better", then you should always use the cloud. No matter how much computing power Apple squeezes into their devices, it won't be a mighty data center.

For running the model once it’s been trained, all a datacenter does is give you lower latency. Once the devices have a large enough memory to host the model locally, then the need to pay datacenter bills is going to be questioned. I’d rather run OpenClaw on my device plugged into a local LLM rather than rely on OpenAI or Claude.

> At some point a beefy Mac Studio and the "right sized" model is going to be what people want.

It's pretty clear that this isn't going to happen any time soon, if ever. You can't shrink the models without destroying their coherence, and this is a consistently robust observation across the board.


I don’t think it’s about literally shrinking the models via quantization, but rather training smaller/more efficient models from scratch.

Smaller models have gotten much more powerful in the last 2 years; Qwen 3.5 is one example of this. The cost/compute requirements of running the same level of intelligence are going down.


I have said for a while that we need a sort of big-little-big model situation.

The inputs are parsed with a large LLM. This gets passed on to a smaller, hyper-specific model. That outputs to a large LLM to make it readable.

Essentially you can blend two model types: Probabilistic Input > Deterministic function > Probabilistic Output. Have multiple little deterministic models that are chosen for specific tasks. Now, all of this is VERY easy to say and VERY difficult to do.

But if it could be done, it would basically shrink all the models needed. Don't need a huge input/output model if it is more of an interpreter.
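The big-little-big routing described above can be sketched as plain code. This is a toy, assuming made-up function names: the two "probabilistic" stages are stand-ins for large models (faked here with string matching and templating), and the middle stage is a registry of small deterministic handlers keyed by task.

```python
import re

def parse_with_large_llm(user_text: str) -> dict:
    """Stage 1 (probabilistic in reality): a large model would turn free
    text into a structured task. Faked here with keyword/number matching."""
    match = re.search(r"(-?\d+(?:\.\d+)?)", user_text)
    if "celsius" in user_text.lower() and match:
        return {"task": "c_to_f", "value": float(match.group(1))}
    return {"task": "unknown", "value": None}

# Stage 2 (deterministic): small task-specific "models", one per task name.
SMALL_MODELS = {
    "c_to_f": lambda c: c * 9 / 5 + 32,
}

def render_with_large_llm(task: str, result) -> str:
    """Stage 3 (probabilistic in reality): a large model would phrase
    the answer. Faked here with a template."""
    return f"Result for {task}: {result}"

def pipeline(user_text: str) -> str:
    parsed = parse_with_large_llm(user_text)
    handler = SMALL_MODELS.get(parsed["task"])
    if handler is None:
        return "No specialised model available for that request."
    return render_with_large_llm(parsed["task"], handler(parsed["value"]))
```

The payoff, as the comment suggests, is that the bookend models only need to parse and phrase, not reason, so they could be much smaller than a do-everything model.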


There are no practically useful small models, including Qwen 3.5. Yes, the small models of today are a lot more interesting than the small models of 2 years ago, but they remain broadly incoherent beyond demos and tinkering.

Yes, but bigger models are still more capable. Models shrinking (iso-performance) just means that people will train and use more capable models with a longer context.

Of course they are! Both are important and will be around and used for different reasons

Cheaper than what you’d expect though. You could get a nice setup for $20-40k 6mo ago. As far as enterprise investments go, that’s a rounding error.

Not all enterprises are the same. I imagine many companies have different departments working toward local optima, so someone who could benefit from it to get more productivity might not have access to it, because the department doing hardware acquisition is being measured in isolation.

I think it’s a little unnecessary to lecture somebody on HN about how enterprises come in different shapes and sizes. It’s pretty clear what I’m implying here if you aren’t actively trying to assume the most reduced, least charitable version of my statement.

Drop that down to 5k, and make it useful.

Give every iPhone family an in-house Siri that will deal with canceling services and pursuing refunds.

Your customer screw-up results in your site getting an agent-driven DDoS on its CS department till you give in.

Siri: "Hey User, here's your daily update, I see you haven't been to the gym, would you like me to harass their customer service department till they let you out of their onerous contract?"


I’m running a modest setup using a Mistral model (24B) on a 9070 (AMD) and 32 GB of RAM: an $1800 machine at the time I built it. It ultimately boils down to what you want to do with it. For me, it’s basically a drafting tool. I use it to break through writer’s block, iterate, or just throw out some ideas. Sometimes summarize, but that can be hit or miss.

I don’t need the latest and greatest, and I fine-tuned LM Studio enough that I get acceptable results in 30 to 90 seconds that help me keep moving ahead. I am not a software engineer, and I am definitely not as much of a “coder” as the average person on HN. So if I can do it for less than $2000, I bet a lot of (smarter/more experienced) coders could see great results for $5000.

You can get an M3 Ultra Mac Studio with 96 GB of RAM for $4000. If you’re willing to go up to $6k, it’s 256 GB. Wayyyyy more firepower than my setup. I imagine plenty powerful for a lot of people.


Next up:

Spend $1.99 and get a chest full of Anthropic emeralds that you can redeem for Claude Chests and a chance at winning a million more tokens.

Or watch this 3-minute ad for 1000 tokens.

I did not think this day would come this soon, but I assure you that Anthropic has no moat.


> if you've ever tried telling a toddler "no"

Parenting is rough! Good for you, for sticking to your guns.

> The plaintiff, Kaley, started using YouTube at age 6 and Instagram at 11.

Who was at the wheel here? If we called up all of Kaley's teachers from this time frame and asked them "were Kaley's parents checked out?", what do you think the answer would be? For as bad as education has gotten, I sympathize with teachers, because parents have gotten FAR worse.

It's not like we don't know these things about people's behavior on devices... maybe it's something that should be talked about in school, along with how credit works and how to file taxes.

Do we need to tell parents "it's 10am, have your kids touched grass yet?"... "It's 10pm, did you take the tablet and phone away so they go the fuck to sleep?"

"touch grass" as a meme/slang is literally people poking fun at the constantly on line. It's "hazing" and "bullying" to drive social correction.

