e12e · 2026-04-25T00:27:53 1777076873

Interesting - but how do I patch, upgrade and build my own iso?

The source repository isn't very enlightening?

> The actual repository here hosts the source code for Lightwhale, and is not of any interest for most people.

> https://bitbucket.org/asklandd/lightwhale/src/master/

alex14fr · 2026-04-25T14:24:07 1777127047

It appears to be outdated (last commit from 2 years ago), and version 3.0 seems not to be there.

e12e · 2026-04-24T16:16:39 1777047399

> this is a great example of a driver that should have been running in userspace a long time ago, just like how Windows has been moving in that direction.

Hasn't Windows (NT lineage) moved solidly in the opposite direction? Used to be you could reload/restart the video card ("GPU") driver if the driver crashed?

halter73 · 2026-04-24T16:31:18 1777048278

I think this conflates two different eras/layers. NT 4 famously moved the window manager/GDI/graphics subsystem into kernel mode, so that’s probably the “opposite direction” history. But modern GPU-driver recovery is WDDM/TDR, and it very much still exists: WDDM splits the display driver into user-mode and kernel-mode components, and TDR resets/recovers a hung GPU/driver instead of requiring a reboot.

https://learn.microsoft.com/en-us/windows-hardware/drivers/d... https://learn.microsoft.com/en-us/windows-hardware/drivers/d...

I also update NVIDIA drivers regularly on Windows 11 without rebooting, though that’s install-time driver reload rather than exactly the same thing as TDR.

jasomill · 2026-04-25T01:13:01 1777079581

And I don't recall any practical way to recover from a crashed NT 3.x GUI subsystem.

It would presumably still function as a server, and be gracefully shut down remotely, but in the absence of anything like Remote Desktop or EMS[1], you'd be hard pressed to get much local troubleshooting done without rebooting the system anyway.

Also, as an NT user since 3.1, and a daily user of 3.5 and 3.51, I don't recall the GUI ever actually hanging or crashing (other than as a side effect of a bugcheck, which, by definition, is a crash triggered by code running in kernel mode).

That's one of the main reasons I was an early and enthusiastic NT user: while I can't say its performance was any better than "good enough", and then only on hardware that was at least comfortably above average in terms of CPU speed and RAM capacity, it was remarkably stable compared to every other PC OS I had used at the time.

Which, to be fair, would have been limited to MS-DOS, 16-bit Windows 2.x and 3.x, and OS/2 2.0 at the time, though it remained true throughout the lifespan of Windows 9x and OS/2 (at least through 3.0, the last version I used), and neither FreeBSD nor Linux were as reliable once you added at least a basic X11 environment to reach rough feature parity (and while X11 did allow recovery after crashing, insofar as it can be restarted without rebooting the system, it still took all your GUI applications and xterm windows down with it when it crashed).

[1] https://en.wikipedia.org/wiki/Emergency_Management_Services

doctorpangloss · 2026-04-24T16:26:21 1777047981

No, it's the opposite. WDDM and DirectX are constantly being updated and have been improving crash recovery of the GPU, driver updates, power management, and abstractions for features like video encoding and storage DMA, among many things. In Linux it is taking ages: the first proposal for DRM to support 2010-era WDDM features was in 2021, and it still does not exist. Graphics is one of the few places where some of Microsoft still innovates. Not in the sense of having great code, though; they just put in the work to coordinate these changes across the handful of vendors. If only someone hosted more steak dinners for Linux.

e12e · 2026-04-22T14:58:48 1776869928

I think there's room for a distinction between "not using metrics" and "not using data".

Unthinkingly leaning on metrics is likely to help you build a faster, stronger horse, while at the same time avoiding building a car, a bus or a tractor.

e12e · 2026-04-22T14:32:07 1776868327

Posting inspired by this tweet on X:

>> I was in Ukraine drone HQ last year and they were using Palantir tech to blow up Russian tanks dude

> And I am a Ukrainian drone pilot on the frontline. We use the Delta Battlefield Management System, fully developed in Ukraine. Not American Peter “Antichrist” Thiel bullshit.

https://x.com/laser_kiwi_ua/status/2046446354558251100

e12e · 2026-04-21T17:03:43 1776791023

The post in question (also linked in TFA):

https://x.com/PalantirTech/status/2045574398573453312

e12e · 2026-04-20T17:24:22 1776705862

Not really comparable perhaps - but I had an Ericsson T18s or similar that went through a full 60C cotton wash cycle (it was switched on at the start of the wash) and was fine after drying off.

The thing is - if the battery had been destroyed, that could have been replaced...

e12e · 2026-04-20T17:12:26 1776705146

I think it might have to do with how models work, and fundamental limits with them (yes, they're stochastic parrots, yes they confabulate).

Newer (past two years?) models have improved "in detail" - or as pragmatic tools - but they still don't deserve the anthropomorphism we subject them to because they appear to communicate like us (and therefore appear to think and reason, like us).

But the "holes" are painted over in contemporary models - via training, system prompts and various clever (useful!) techniques.

But I think this leads us to have great difficulty spotting the weak spots in a new, or slightly different model - but as we get to know each particular tool - each model - we get better at spotting the holes on that model.

Maybe it's poorly chosen variable names. A tendency to write plausible-looking, plausibly named e2e tests that turn out not to quite test what they appear to test at first glance. Maybe there's missing locking of resources, or missing use of transactions, in sequential code that appears sound - but ends up storing invalid data when one or several steps fail...

In happy cases current LLMs function like well-intentioned junior coders enthusiastically delivering features and fixing bugs.

But in the other cases, they are like pathologically lying sociopaths telling you anything you want to hear, just so you keep paying them money.

When you catch them lying, it feels a bit like a betrayal. But the parrot is just tapping the bell, so you'll keep feeding it peanuts.

e12e · 2026-04-14T19:09:33 1776193773

> for static API keys, the backend injects the credential directly into the agent's runtime environment.

What prevents the agent from persisting or leaking the API key - or reading it from the environment?
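
To make the worry concrete, here is a minimal sketch. The variable name `EXAMPLE_API_KEY` and the key value are made up for illustration and are not taken from the product being discussed. It only shows that anything running in the process, and by default anything that process spawns, can read an injected environment variable:

```python
import os
import subprocess
import sys

# Hypothetical name and value, standing in for whatever the backend injects.
os.environ["EXAMPLE_API_KEY"] = "sk-demo-not-a-real-key"


def read_key_directly() -> str:
    # Any code running inside the process can read the environment.
    return os.environ["EXAMPLE_API_KEY"]


def read_key_from_child() -> str:
    # Child processes inherit the parent's environment by default,
    # so a tool or shell command started by the agent sees the key too.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['EXAMPLE_API_KEY'])"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Once the value has been read, nothing in the environment mechanism itself stops it from being written to a file or sent over the network.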

e12e · 2026-04-13T14:54:23 1776092063

> Meta CEO Mark Zuckerberg could soon have an AI clone of himself to interact with and provide feedback to employees, according to a report from the Financial Times.

https://www.ft.com/content/02107c23-6c7a-4c19-b8e2-b45f4bb9c...

https://archive.is/mtVXJ

e12e · 2026-04-11T17:40:39 1775929239

Tunnel vision? If your model can handle big context, why divide into lesser problems to conquer - even if such splitting might be quite trivial and obvious?

It's the difference of "achieve the goal", and "achieve the goal in this one particular way" (leverage large context).

wat10000 · 2026-04-11T17:54:47 1775930087

I meant, if the claim here is that small models can accomplish the same things with good scaffolding, why didn’t they demonstrate finding those problems with good scaffolding rather than directly pointing them at the problem?

mattmanser · 2026-04-11T21:30:03 1775943003

They don't have to.

Lot of people in this thread don't seem to be getting that.

If another model can find the vulnerability if you point it at the right place, it would also find the vulnerability if you scanned each place individually.

People are talking about false positives, but that also doesn't matter. Again, they're not thinking it through.

False positives don't matter, as you can just automatically try and exploit the "exploit" and if it doesn't work, it's a false positive.

Worse, we have no idea how Mythos actually worked, it could have done the process I've outlined above, "found" 1,000s of false positives and just got rid of them by checking them.

The fundamental point is it doesn't matter how the cheap models identified the exploit, it's that they can identify the exploit.

When it turns out the harness is just acting as a glorified for-each brute force, it's not the model being intelligent, it's simply the harness covering more ground. It's millions of monkeys bashing type-writers, not Shakespeare at one.
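
The scan-then-verify process outlined above can be sketched as a plain loop. This is only a rough sketch under stated assumptions: `find_candidate` and `try_exploit` are hypothetical stand-ins for a model call and an automated exploit check, and the toy versions below are invented for illustration. It says nothing about how Mythos or any real harness actually works.

```python
from typing import Callable, Iterable, List


def scan_and_verify(
    locations: Iterable[str],
    find_candidate: Callable[[str], bool],
    try_exploit: Callable[[str], bool],
) -> List[str]:
    """Scan every location; keep only findings whose exploit attempt succeeds."""
    confirmed = []
    for loc in locations:
        # Step 1: point the (cheap) model at one place at a time.
        if not find_candidate(loc):
            continue
        # Step 2: try to exploit the finding; a failed attempt is
        # treated as a false positive and dropped.
        if try_exploit(loc):
            confirmed.append(loc)
    return confirmed


# Toy stand-ins: a noisy "model" that flags three files, one of which is real.
def toy_model(loc: str) -> bool:
    return loc in {"parse.c", "auth.c", "net.c"}


def toy_exploit(loc: str) -> bool:
    return loc == "auth.c"


files = ["main.c", "parse.c", "auth.c", "net.c", "util.c"]
confirmed = scan_and_verify(files, toy_model, toy_exploit)
```

In the toy run the model raises two false positives, and the verification step discards both.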

wat10000 · 2026-04-12T01:19:49 1775956789

It’s strange to see this constant “I could do that too, I just don’t want to” response.

Finding an important decades-old vulnerability in OpenBSD is extremely impressive. That’s the sort of thing anyone would be proud to put on their resume. Small models are available for anyone to use. Scaffolding isn’t that hard to build. So why didn’t someone use this technique to find this vulnerability and make some headlines before Anthropic did? Either this technique with small models doesn’t actually work, or it does work but nobody’s out there trying it for some reason. I find the second possibility a lot less plausible than the first.

cycomanic · 2026-04-12T07:51:56 1775980316

From the article:

> At AISLE, we've been running a discovery and remediation system against live targets since mid-2025: 15 CVEs in OpenSSL (including 12 out of 12 in a single security release, with bugs dating back 25+ years and a CVSS 9.8 Critical), 5 CVEs in curl, over 180 externally validated CVEs across 30+ projects spanning deep infrastructure, cryptography, middleware, and the application layer.

They have been doing it (and likely others as well), but they are not Anthropic, with a million-dollar marketing budget and trillion-dollar hype behind it, so you just didn't hear about it.

roywiggins · 2026-04-12T14:25:10 1776003910

They could have linked their replication in this blog post, which we did all see, if they have one.

mattmanser · 2026-04-12T07:58:04 1775980684

Why are you EXTREMELY impressed? The level of hysteria and lack of objective thought by pro-AI people on this thread is extremely concerning.

Vulnerabilities are found every day. More will be found.

They claim they spent $20k finding one, probably more like $20 million if you actually dug into it.

And if you took into account inference, more like $2 billion.

The reason why no-one's done it is because it's not worth the money in tokens to do so.

LordDragonfang · 2026-04-12T00:03:35 1775952215

> If another model can find the vulnerability if you point it at the right place, it would also find the vulnerability if you scanned each place individually.

They didn't just point it at the right place, they pointed it at the right place and gave it hints. That's a huge difference, even for humans.