I don’t think that's the real dichotomy here. You can either produce 2-5x good maintainable code, or 10-50x more dogshit code that works 80-90% of the time and will be a maintenance nightmare.
The management has decided that the latter is preferable for short term gains.
> You can either produce 2-5x good maintainable code, or 10-50x more dogshit code that works 80-90% of the time, and that will be a maintenance nightmare.
It's actually worse than that, because really the first case is "produce 1x good code". The hard part was never typing the code, it was understanding and making sure the code works. And with LLMs as unreliable as they are, you have to carefully review every line they produce - at which point you didn't save any time over doing it yourself.
Look at the pretty pictures AI generates. That's where we are with code now. Except you have ComfyUI instead of ChatGPT: you can work with precision.
I'm a 500k TC senior SWE. I write six-nines, active-active, billion-dollar-a-day systems. I'm no stranger to writing thirty-page design documents. These systems can work in my domain just fine.
> Look at the pretty pictures AI generates. That's where we are with code now.
Oh, that is a great analogy. Yes, those pictures are pretty! Until you look closer. Any experienced artist or designer will tell you that they are dogshit and have no value. Look no further than Ubisoft and their Anno 117 game for proof.
Yep, that's where we are with code now. Pretty - until you look closely. Dogshit - if you care to notice the details.
Not to mention how hard it is to actually get what you want out of it. The image might be pretty, and kinda sorta what you asked for. But if you need something specific, trying to get AI to generate it is like pulling teeth.
Since we’re apparently measuring capability and knowledge via comp, I made 617k last year. With that silly anecdote out of the way, in my very recent experience (last week), SOTA AI is incapable of writing shell scripts that don’t have glaring errors, and also struggles mightily with RDBMS index design.
Can they produce working code? Of course. Will you need to review it with much more scrutiny to catch errors? Also yes, which makes me question the supposed productivity boost.
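To make the "glaring errors" point concrete, here's a hypothetical sketch (not a script from the thread) of the single most common class of bug I see in generated shell: unquoted variable expansion, which silently breaks on filenames containing spaces.

```shell
#!/bin/sh
# Illustration of a classic generated-shell bug: unquoted $f undergoes
# word splitting, so a filename with a space breaks the test.

dir=$(mktemp -d)
touch "$dir/report final.txt"

# Buggy pattern: $f is unquoted, so "report final.txt" splits into two
# words and the -f test fails (error suppressed for clarity).
buggy_count=0
for f in "$dir"/*; do
  if test -f $f 2>/dev/null; then
    buggy_count=$((buggy_count + 1))
  fi
done

# Correct pattern: quoting "$f" keeps the filename intact.
fixed_count=0
for f in "$dir"/*; do
  if test -f "$f"; then
    fixed_count=$((fixed_count + 1))
  fi
done

echo "buggy=$buggy_count fixed=$fixed_count"
rm -rf "$dir"
```

This is exactly the kind of error that "works 80-90% of the time": fine in the demo, broken the first time real-world input shows up, and invisible unless the reviewer actually reads every line.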
The problem is not that it can’t produce good code if you’re steering. The problem is that:
There are multiple people on each team, and you cannot know how closely each teammate monitored their AI.
Somebody who does not care will vastly outperform your output, by orders of magnitude. With the current unicorn-chasing trends, that approach tends to be more rewarded.
This creates an incentive to not actually care about quality, which will cause issues down the road.
I quite like using AI. I do monitor what it’s doing when I’m building something that should work for a long time. I also do total blind vibe coded scripts when they will never see production.
But for large programs that will require maintenance for years, these things can be dangerous.