It doesn't remove the need for software, but it greatly reduces the number of tools needed, and it removes the need to build custom tools that might not be viable given the very specific needs many users have.
OP gave a good example of how their workflow changed. You could argue there are tools that could've done that, but they achieved their goals without them, have something that fits their workflow perfectly and can be fine-tuned when things change, and with a few other tools (Word, Excel, Figma) they can do all sorts of things that would've required a small team or far more (expensive) tools to execute.
To me that is a great example of non-developers using tools to enhance their workflows, and with initiatives like the one in this topic, I can only see that increasing.
It's funny to think that with a model release Anthropic can slide in some instructions ("be a bit more detailed" or something similar) that raise the token output by a few percent, say 5-10%. Most users would not notice, but over the course of a year it would bring solid growth (once the VC craze is over, if ever) and increase income.
"Regular companies" would love to have growth like that without effectively doing anything.
I like how some people are accusing them of reducing the overall token usage to screw over Claude Code users and then there are yet other people that are accusing them of deliberately increasing token usage to screw over API users (or maybe to get subscription users to upgrade, I'm not really sure)
I suspect the real issue is that they just change stuff "randomly" and the experience gets worse/better cheaper/more expensive.
Since you have no way of knowing when they change stuff, you can't really know if they did change something or it's just bias.
I've experienced that so many times in the last month that I switched to codex. The worst part is, it could be entirely in my head. It's so hard to quantify these changes, and the effort it takes isn't worth it to me. I just go by "feeling".
The issue is business and transparency. Transparency is often in the customer's interest at the individual business's expense.
There are very, very few things that can be completely transparent without giving competitors an advantage. The nice solution to this is to be better and faster than your competitors, but sometimes it's easier just to remove transparency.
I expect "model transparency" to become the new "SSO" enterprise feature differentiator.
Enterprise use cases have to have it (or else pawn the YOLO off on their users), so it will be a key way to bucket customers into non-enterprise vs enterprise pricing.
They don't even need to do anything. LLMs are effectively random anyway. Even ignoring temperature and inadvertent nondeterminism in inference, the change in outputs from a change in inputs is unpredictable and basically pseudorandom. That's not to say they aren't useful, just that Anthropic could make zero changes and people would still see variations that they'd attribute to malice.
They absolutely do change this all the time - session limits vary wildly. The most damning proof of this is that there's absolutely no information about how many tokens you get per session with each subscription level, it's just terms like 5x, 20x. But 5x what? Who knows?
That's not proof of anything. Also the usage is not solely based on tokens because you also have to factor in things like prompt caching costs (and savings). So it's based on the actual API cost.
Again, it is not based on number of tokens. If it was solely based on number of tokens then things like cache misses would not impact the usage so much. It's based on the actual cost which includes things like the caching costs.
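To make the caching point concrete, here is a rough sketch of cost-based accounting. The prices are made up for illustration and are not Anthropic's actual rates; `Prices` and `request_cost` are hypothetical names, not part of any real SDK. The only thing it is meant to show is that two requests with the same token count can cost very different amounts depending on whether the context was read from cache or had to be written to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Made-up prices in dollars per million tokens (illustration only)."""
    fresh_input: float = 1.00
    cache_write: float = 1.25
    cache_read: float = 0.10
    output: float = 5.00


def request_cost(prices, fresh_input=0, cache_write=0, cache_read=0, output=0):
    """Dollar cost of one request, given token counts per billing category."""
    total = (
        fresh_input * prices.fresh_input
        + cache_write * prices.cache_write
        + cache_read * prices.cache_read
        + output * prices.output
    )
    return total / 1_000_000


prices = Prices()
# Same request both times: 100k tokens of context in, 1k tokens out.
cache_hit = request_cost(prices, cache_read=100_000, output=1_000)
cache_miss = request_cost(prices, cache_write=100_000, output=1_000)
print(f"hit: ${cache_hit:.3f}  miss: ${cache_miss:.3f}")
```

With these assumed prices the cache hit costs about 1.5 cents and the miss about 13 cents, even though both requests move 101k tokens. A limit expressed purely in tokens could not capture that difference.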
I think this is the case. In the early GPT-4 days I tested the same model side by side across the subscription and API. The API always produced a longer better answer. To me it felt like the API model was working how it was supposed to work while the subscription model tried to reduce its token usage. From a business perspective that would make sense. I then switched to API only because I felt like it was worth the extra cost.
I did a similar test with Sonnet about 6 months ago and noticed no difference, except that the subscription was way cheaper than API access. This is not the case anymore, at least not for me. The subscription these days only lasts for a few requests before it hits the usage limit and goes over to "extra usage" billing. Last week I burned through my entire subscription budget and $80 worth of extra usage in about 1h. That is not sustainable for me and is the reason I started looking at alternatives.
From a business perspective it all makes sense. Anthropic recently gave away a ton of extra usage for free. Now people have a balance on their accounts that Anthropic needs to pay for with compute, and suddenly they release a model that seems to burn those tokens faster than ever. Last week I felt like the model did the opposite: it was stopping mid-implementation and forgetting things after only 2 turns. Based on the responses I got, it seemed like they were running out of compute, lobotomized their model and made it think less, give shorter answers etc. They are probably also doing A/B testing on every change, so my experience might be wildly different from someone else's.
The UIs all bake in system prompts and other tunable configs that the API leaves open, as do Claude Code and other harnesses. So any difference you notice over the API when you're controlling the client is almost certainly due to that. Note that this is kind of something they have to do, because consumer UI users will do stuff like ask models their name or the date, or want them to respond politely and compassionately, and get upset/confused when they just get what's in the weights.
The problem with subscriptions for this kind of stuff is that it's just incompatible with their cost structure. The worst being, subscription usage is going to follow a diurnal usage pattern that overlaps with business/API users, so they're going to have to be offloaded to compute partners who most likely charge by the resource-second. And also, it's a competitive market, anybody who wants usage-based pricing can just get that.
So you basically end up with adverse selection with consumer subscription models. It's just kind of an incoherent business model that only works when your value proposition is more than just compute (which has a usage-based, pretty fungible market)
> In the early GPT-4 days I tested the same model side by side across the subscription and API. The API always produced a longer better answer.
If you are comparing responses in ChatGPT to the API, it's apples and oranges, since one applies a very opinionated system prompt and the other does not.
Since you haven't figured that out in 3 years, I didn't bother reading the rest of your comment.
Protesting against ethnic cleansing is a bad thing, that’s what you’re saying?
No matter what kind of mental gymnastics you try to do, this is just an obvious case of a foreign government having a huge influence and control over internal US affairs.
Mentioning drinking three times (effects of drinks in the evening, hangover, effects on learning) in a single response might give an impression you like to drink.
I mean, they sell alcohol in shops for money, and not force it on you in some government-mandated way. Which kinda tells that people in general like to drink.
Huh, that's much less than I expected! Apparently it's just for the adult population (correctly). Apparently that varies from year to year (in 2024 it was 64%), and by race (highest in white people, at 70%). I also wonder how that looks when we exclude older people (let's say over 60), who more often have health problems or just tolerate alcohol very badly.
It's higher in my country so good to know that USA drinks less than I assumed.
The topic is drinking so they mentioned drinking, and usually people do what they like to do, and in other news water is wet - but do we judge water for being wet? So let's stop virtue signaling because it's definitely not a show of virtue. I see where religious fundamentalism is taking the world and damn if I like it.
I’m not religious at all, and it’s not religious fundamentalism to point out that alcohol disrupts sleep, and that this likely is the primary factor affecting the poster’s sleep.
Also not religious fundamentalism to point out that alcohol is a known carcinogen (: that’s just science. It’s a Group 1 carcinogen, the same group as tobacco and asbestos.
But sunlight is essential for the cutaneous conversion of 7-dehydrocholesterol to vitamin D3, whereas ethanol serves no essential purpose, irrespective of whether one enjoys it or not.
Personally I don’t consume ethanol; but I don’t care if others do or not so long as they stay off the roads and are not piloting my flight.
I will say that when I did consume ethanol even in small quantities, my sleep was much worse than it is at baseline; and that effect only worsened as I got older.
Ethanol is food: you also don't need carbs, but they do keep you alive. More importantly, it is fairly central to cultural vitality. Not essential, but it plays a highly functional role. Maybe it could be replaced by religion or other drugs, but short of that, the world is less vibrant without it.
Unrelated to any religion, any amount of alcohol is harmful for health[1]; science has recently shed a lot of light on this. The OP mentioned drinking and was surprised to see someone noticing they mentioned drinking three times in their post.
You mean codex (client) with GPT 5.4 xhigh? I am using Codex 5.3 (model) through Cursor, waiting for Codex 5.4 model as I had great experience so far with 5.3.
But certainly indirectly with cash. All the advertised products are more expensive than they could be, due to the costs of advertising. This comes out of everyone's pocket.
Exactly. Deciding on some very expensive subscriptions that can cost 1k per month or so might be worth thinking about, but this is just meaningless optimisation.
not at all meaningless. unless you have money to invest, at the beginning you don't have an income. i could not afford to spend $300 a month to host a new product that doesn't make any money yet. i can afford the $20 however, but then once the product does make money, why should i change it if it works?
that's a strange argument. living in a dorm doesn't work the same way as living on my own. so actually if i like it, yeah, why would i change it until i need to? i would change it to have more privacy or to be closer to my new workplace. those are qualitative differences that matter.
moving from self hosting to a hosted service gives me other advantages. whether those matter depends on the specific situation. the reality is that in many cases the hosting costs rise faster than the costs for self hosting would, and big companies save a lot of money by moving back.
switching from $20 a month to $300 a month is only worth it if i save at least a day of work per month. but that's often not the case. and the more you scale up the less work per revenue/profit you have.
the one big advantage of paid services is that they let you scale up faster. if that is what you need, then sure. OP does not scale fast.
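The $20 vs $300 comparison above boils down to a break-even calculation, sketched below. The two monthly prices come from the comment; the $35 hourly rate and the function name `break_even_hours` are assumptions picked purely for illustration, so plug in your own numbers.

```python
def break_even_hours(cheap_monthly, pricey_monthly, hourly_rate):
    """Hours of work per month the pricier option must save to pay for itself."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return (pricey_monthly - cheap_monthly) / hourly_rate


# $20 and $300 per month are from the comment; $35/hour is an assumed rate.
hours = break_even_hours(20, 300, 35)
print(f"break-even: {hours:.1f} hours saved per month")
```

At the assumed rate, the extra $280 a month only pays off if the hosted service saves 8 hours, which is about one working day per month. A higher hourly rate lowers that threshold, a lower one raises it.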