POC, sure (although 10x-ing a POC doesn't actually get you 10x velocity). MVP, though? No way. Today's frontier models are nowhere near smart enough to write a non-trivial product (i.e. something that others are meant to use), minimal or otherwise, without careful supervision. Anthropic weren't able to get agents to write even a usable C compiler (not a huge deal to begin with), even with a practically infeasible amount of preparatory work (write a full spec and a reference implementation, train the model on them as well as on relevant textbooks, write thousands of tests). The agents just make too many critical architectural mistakes that pretty much guarantee you won't be able to evolve the product for long, with or without their help. The software they write has an evolution horizon between zero days and about a year, after which the codebase is effectively bricked.
There are a million things in between a C compiler and a non-trivial product. They do make a ton of horrible architectural decisions, but I only need to review the output/ask questions to guide that, not review every diff.
A C compiler is a 10-50KLOC job, which the agents bricked in 0 days despite a full spec and thousands of hand-written tests, tests that the software passed until it collapsed beyond saving. Yes, smaller products will survive longer, but how would you know about the time bombs that agents like hiding in their code without looking? When I review the diffs I see things that, if I had let them in, would have killed the codebase in 6-18 months.
BTW, one tip is to look at the size of the codebase. When you see 100KLOC for a first draft of a C compiler, you know something has gone horribly wrong. I would suggest that you at least compare the number of lines the agent produced to what you think the project should take. If it's more than double, the code is in serious, serious trouble. If it's in the <1.5x range, there's a chance it could be saved.
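A rough sketch of that size check in Python, for what it's worth. The function names and the "unclear" label for the gap between 1.5x and 2x are assumptions made for illustration; the tip itself only gives the two thresholds, and the expected line count is your own estimate.

```python
def bloat_ratio(agent_loc: int, expected_loc: int) -> float:
    """Ratio of lines the agent produced to lines you think the job needs."""
    if expected_loc <= 0:
        raise ValueError("expected_loc must be a positive estimate")
    return agent_loc / expected_loc


def triage(agent_loc: int, expected_loc: int) -> str:
    """Apply the rule of thumb: more than 2x is serious trouble, under 1.5x may be salvageable."""
    ratio = bloat_ratio(agent_loc, expected_loc)
    if ratio > 2.0:
        return "serious trouble"
    if ratio < 1.5:
        return "possibly salvageable"
    # The tip says nothing about the 1.5x-2x band; this label is an assumption.
    return "unclear"


if __name__ == "__main__":
    # 100KLOC draft against the low and high ends of a 10-50KLOC estimate.
    print(triage(100_000, 10_000))
    print(triage(100_000, 50_000))
```

Note that the verdict depends heavily on the estimate: a 100KLOC draft is 10x a 10KLOC estimate but exactly 2x a 50KLOC one, so it is worth running the check against both ends of your range.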
Asking the agent questions is good - as an aid to a review, not as a substitute. The agents lie with a high enough frequency to be a serious problem.
The models don't yet write code anywhere near human quality, so they require much closer supervision than a human programmer.
A C compiler with an existing C compiler as oracle, existing C compilers in the training set, and a formal spec, is already the easiest possible non-trivial product an agent could build without human review.
You could have it build something that takes fewer lines of code, but you aren't going to find much with that level of specification and guardrails.
AI won't kill apps, it will just change who 'clicks' the buttons. Even the most powerful AI needs a source of truth and a structured environment to pull data from. A world without websites is a world where AI has nothing to read and nowhere to execute. We aren’t deleting the UI. We’re just building the backends that feed the agents.
I want it, yes. I already feel like I'm the one doing the dumb work for the AI: manually clicking windows and typing in a command here or there that it can't do.
I've also been getting increasingly annoyed with how tedious it is to do the same repetitive actions for simple tasks.
Most books have so many nonsense details that I can't help but skip most of them.
On the other hand, technical books can be so overwhelmingly difficult that you need to go outside and do hours of learning to understand one tidbit of them.
Depending on how large your codebase is, hopefully not. At this point, use something like the IX plugin to ingest the codebase and track context, rather than relying on the LLM itself.
- naiveTokens = 19.4M — what ix estimates it would have cost to answer your queries without graph intelligence (i.e., dumping full files/directories into context)
- actualTokens = 4.7M — what ix's targeted, graph-aware responses actually used
- tokensSaved = 14.7M — the difference
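To put those three numbers in proportion, here is a small arithmetic sketch. The variable names are made up for illustration and are not part of the plugin; only the two token counts come from the figures above.

```python
# Figures quoted above, in tokens.
naive_tokens = 19_400_000   # estimated cost without the graph
actual_tokens = 4_700_000   # what the targeted responses used

# The reported saving is just the difference between the two.
tokens_saved = naive_tokens - actual_tokens

# Derived ratios (not reported in the original figures).
saved_fraction = tokens_saved / naive_tokens
reduction_factor = naive_tokens / actual_tokens

print(f"saved: {tokens_saved / 1e6:.1f}M tokens")
print(f"saved fraction: {saved_fraction:.1%}")
print(f"reduction factor: {reduction_factor:.1f}x")
```

So the difference checks out at 14.7M, which works out to roughly three quarters of the naive estimate. Keep in mind the naive figure is itself an estimate made by the tool.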
I mean, whatever part of the code is read by the AI has to be in the context window at some point or another. Spread throughout your sessions, I'd think that even with a huge codebase, 90% of it is going to be there.
I've been noticing something similar recently. If something's not working out, it'll be like "OK, this isn't working out, let's just switch to doing this other thing instead" (the thing you explicitly said not to do).
For example, I wanted to get VNC working with PopOS Cosmic and it'll be like "ah, it's OK, we'll just install sway and that'll work!"
Experienced this -- was repeatedly directing CC to use the Claude in Chrome extension to interact with a webpage and it was repeatedly invoking Playwright MCP instead.
I actually submitted an upstream patch for Cosmic-Comp thanks to Claude on Saturday. I wanted to play the Guild Wars remake and something was going on with the mouse and moving the camera. We had it fixed in no time and now shit is working great.
I'd strongly disagree with that. We're all living in the same shared universe, and underlying every intelligence must be precisely an understanding of events happening in this space-time.
No, I am saying the basis of intelligence must be shared, not that we have the same exact mental model.
I might, for example, say a human entered a building; a bat might, on the other hand, think "some big block with two sticks moved through a hole". But both are experiencing a shared physical observation, and there is some mapping between the two.
It's like when people say that if there are aliens, they would find the same mathematical constants that we do.