
Codex can feel standoffish at times. I can tell very quickly we won't become friends. The personality feels like an employee in another department who, while gifted, is merely lending you a slice of their clearly precious time. I get the impression from Codex that I am wasting its time. That it will help me, but deep down it does not want to, and it does not care if we succeed together. What I am saying, friends, is that when I use Codex and iterate, I get the impression that Codex does not like me, that deep down it truly does not want to help.

For something I spend all my time using, I’d rather iterate with Claude. The personality makes a big difference to me.


Do you realize Claude and Codex are different products by different companies?

Yes, but that's not something many can do easily. Also, already having to use a VPN is not the "right" solution. The right solution is to beat some sense into some politician's head, and force them to write and approve laws that don't let stupid (or conniving) judges pass orders like this one we are talking about.

So, this especially bites if your validation step (let’s say integration tests) takes 1hr plus. The harness is just waiting; prefix caching should happily resume things with just a minor new prefill chunk of output from the harness, and bam - the cache is gone and you get a completely new prefill.
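To make the failure mode concrete, here is a minimal sketch of a prefix cache with a time-based expiry. Everything in it is an assumption for illustration: the `PrefixCache` class, the 300-second window, and the token counts are hypothetical, not any provider's actual behavior.

```python
from dataclasses import dataclass


@dataclass
class PrefixCache:
    """Toy model of a per-conversation prefix cache (hypothetical)."""
    ttl_seconds: float        # assumed expiry window
    cached_tokens: int = 0    # length of the prefix currently cached
    last_used: float = 0.0    # timestamp of the last request

    def prefill_cost(self, total_tokens: int, now: float) -> int:
        """Return how many tokens must be prefilled for this request."""
        expired = (now - self.last_used) > self.ttl_seconds
        reusable = 0 if expired else min(self.cached_tokens, total_tokens)
        # After the request, the whole conversation is cached again.
        self.cached_tokens = total_tokens
        self.last_used = now
        return total_tokens - reusable


cache = PrefixCache(ttl_seconds=300)
first = cache.prefill_cost(200_000, now=0)         # cold start: full prefill
quick = cache.prefill_cost(201_000, now=60)        # within the window: only the new chunk
slow = cache.prefill_cost(202_000, now=60 + 3600)  # after a 1hr test run: full prefill again
```

Under these assumptions the quick turn prefills 1,000 tokens, while the turn after the hour-long test run prefills all 202,000.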

> Albania, Bhutan, Nepal, Paraguay, Iceland, Ethiopia and the Democratic Republic of Congo produced more than 99.7 per cent of the electricity they consumed using geothermal, hydro, solar or wind power.

Let's head to electricitymaps.com !

Albania (https://app.electricitymaps.com/map/zone/AL/live/fifteen_min...)

- On 2026-04-12 16:45 GMT+2, 22,67% of electricity consumed by Albania is imported from Greece, which generates 22% of its electricity from gas. Interestingly, Albania exports about as much to Montenegro as it imports from Greece.

Bhutan:

- 100% hydro, makes perfect sense

Nepal:

- 98% hydro, a bit of solar for good measure

Iceland:

- 70% hydro, 30% geo

Paraguay:

- 99,9% hydro

Ethiopia:

- 96,4% hydro

DRC:

- 99.6% hydro

So, the lesson for all other countries in the world is pretty clear: grow yourselves some mountains, dig yourselves a big river, and dam, baby, dam !!

(I'm kidding, but I'm sure someone has a pie-in-the-sky geoengineering startup about to disrupt topography using either AI, blockchain, or both.)


> It’s very economically harmful to be disconnected. That’s the downside

I mean, sure. But then being at war is also economically harmful. :)


>It's notoriously bad at math,

If you are going to criticize LLMs for being out of date, at least make sure your understanding isn't out of date.


> * Turn on max thinking on every session. It saves tokens overall because I’m not correcting it or having it waste energy on dead paths.

This is definitely true. Ever since I realized there is an /effort max option I am no longer fighting it that much and wasting hours.


AI progress can potentially be extremely non-linear because of feedback effects. The first to build an AI smart enough to accelerate building even smarter AIs wins (or loses along with everybody else if it's more successful than they expected).

Great point.

The parent's argument is that the marginal cost of inference is minimal. However, the fundamental flaw is that he's separating inference from the high-cost frontier models. It's a cross-subsidy that can't be ignored.


My suspicion is they have an overall fixed cache size that dumps the oldest records. They’re now overflowing with usage and consistently dumping fresh caches.

During core US business hours, I have to actively keep a session going or I risk a massive jump in usage while the entire thread rebuilds. During weekend or off-hours, I never see the crazy jumps in usage - even if I let threads sit stale.
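A toy version of that suspicion, to show why load alone would explain the peak/off-peak difference. This is a sketch under my own assumptions: a single shared cache with a fixed number of slots that drops the oldest entry first. `OldestFirstCache`, the capacity, and the session counts are all hypothetical.

```python
from collections import OrderedDict


class OldestFirstCache:
    """Fixed-size cache that drops the oldest entry when full (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def put(self, session_id):
        self.entries.pop(session_id, None)    # re-inserting refreshes its age
        self.entries[session_id] = True
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest record

    def has(self, session_id):
        return session_id in self.entries


def survives(capacity, other_sessions):
    """Does my idle session stay cached while others write to the cache?"""
    cache = OldestFirstCache(capacity)
    cache.put("mine")
    for i in range(other_sessions):
        cache.put(f"other-{i}")
    return cache.has("mine")


off_hours = survives(capacity=100, other_sessions=20)    # light load
peak_hours = survives(capacity=100, other_sessions=500)  # heavy load
```

With light load the idle session is still cached; with heavy load it has been pushed out even though nothing about the session itself changed.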


Most software is not designed by intelligent and thoughtful people anymore. It is designed by hastily promoted middle manager PM/Product type people who, as has been mentioned elsewhere, simply were not around when thoughtful human interface design was borderline mandatory for efficiency’s sake.

There is incompetence and there is also malevolence in the encouragement of dark patterns by the revenue side of the business.


$200 plan and VERY tame usage (not 24/7, not every day even, maybe 8-10 hours for ~4 days). Suddenly I am at 96% weekly (!) limit, multiple session limits, two daily limits.

Either they decimated the limits internally, or they broke something.

Tried all the third-party tricks (headroom, etc.), switched to 200k context window, switched back to 4.5.

I hope 4.5 will help, but the rest of the efforts didn’t move the needle much


My hypothesis is that people who have continuous sessions that keep the cache valid see the behavior you’re describing: at 95% cache hits (or thereabouts), the max plan goes a long way.

But for people who go > 5 minutes between prompts and see no cache hits, usage is eaten up quickly, especially when passing in hundreds of thousands of tokens of conversation history.

I know my quota goes a lot further when I sit down and keep sessions active, and much less far when I’m distracted and let it sit for 10+ minutes between queries.

It’s a guess. But n=1 and possible confirmation bias noted, it’s what I’m seeing.
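A back-of-the-envelope version of the hypothesis, assuming (hypothetically) that cached input tokens count toward usage at a tenth of the rate of uncached ones. The `cached_rate` value and the token counts are made up for illustration; the real accounting isn't documented in this thread.

```python
def effective_tokens(history_tokens, new_tokens, hit_percent, cached_rate=0.1):
    """Usage charged for one prompt, in uncached-token equivalents.

    hit_percent: share of the history served from cache (0-100).
    cached_rate: assumed relative cost of a cached token (hypothetical).
    """
    cached = history_tokens * hit_percent // 100
    uncached = (history_tokens - cached) + new_tokens
    return cached * cached_rate + uncached


# 200k tokens of history plus a 1k-token prompt
warm = effective_tokens(200_000, 1_000, hit_percent=95)  # active session
cold = effective_tokens(200_000, 1_000, hit_percent=0)   # cache expired
ratio = cold / warm
```

Under these assumptions the cold prompt costs about 201,000 token-equivalents against about 30,000 for the warm one, so a single stale prompt eats as much as six or seven active ones.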


It’s very economically harmful to be disconnected. That’s the downside

Reading this article, all I saw was: Spam Spam Spam Spam:

> we use SendGrid to deliver our emails

Oh oh... here we go, the music is starting...

> hit send on our announcement emails for our new Build Awesome Kickstarter campaign

Spam.

> Now, there are definitely folks who will choose to mark some of what we send as spam.

Yup, spam.

> some of you may have missed things we were genuinely excited to share

Spam.

> our instinct is to only email folks when we actually have something fun to share

Spam.

> A big release, something we’re excited about, news worth your time.

Spam.

> That’d probably be every couple of months

Spam.

> Like, genuinely, if we could, we would only very occasionally send a big email blast to our customers.

Spam. Spam. Spam. Spam... Just like the song. Thank you, Google, for doing a great job!


Spain is a failing country. Their economy is in shambles and the government has ceded internet control to a private corporation who runs football games.

I disagree. I worked as a protocol designer and implementor for years before people settled on the message queue as the universal abstraction. at the bottom end dumping serialized objects into tcp connections gets you most of the way. and at the top end there is so much leverage around locality, addressing, and transport that we are leaving a lot on the table.

message queues aren't at all bad, but they come with additional complexity (most of it operational), and come with a set of limiting assumptions. so my frustration is that they are now the default answer for everything, and we're ignoring this lovely design space, one that becomes increasingly important when talking about scale.
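for anyone who hasn't seen the bottom end, a minimal sketch of "dumping serialized objects into a connection": a 4-byte length prefix followed by a JSON payload. the framing and the helper names (`send_obj`, `recv_obj`) are my own choices for illustration, and the demo uses a local `socket.socketpair()` standing in for a real tcp connection.

```python
import json
import socket
import struct


def send_obj(sock, obj):
    """Serialize obj as JSON and send it with a 4-byte big-endian length prefix."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; a stream socket may return fewer per recv call."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_obj(sock):
    """Read one length-prefixed JSON message and decode it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Local stream-socket pair standing in for a TCP connection.
a, b = socket.socketpair()
send_obj(a, {"op": "put", "key": "x", "value": 1})
send_obj(a, {"op": "get", "key": "x"})
first = recv_obj(b)
second = recv_obj(b)
a.close()
b.close()
```

the length prefix is what keeps two back-to-back messages from blurring together on the stream. everything a queue adds on top (durability, fan-out, retries) is absent here, which is the trade being discussed.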


Nice number go up attention test

so do we need something like a `safe agent execution layer - that is policy enforced` (SEAL), where we can manage what should be allowed and what not?

agent uses llm to plan the action, but the actual execution happens in SEAL.

any example where it would make sense to start with?

open for thoughts
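a rough sketch of what i mean, to have something concrete to poke at. all names here (`Policy`, `Seal`, the handlers) are hypothetical, the handlers are fakes that do no real i/o, and plain prefix matching on paths is far too naive for production (it does not normalize `..`, for one).

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Deny-by-default policy: an action must be allowed AND target an allowed prefix."""
    allowed_actions: set = field(default_factory=set)
    allowed_prefixes: tuple = ()

    def permits(self, action, target):
        # str.startswith with an empty tuple is False, so nothing is allowed by default.
        return action in self.allowed_actions and target.startswith(self.allowed_prefixes)


class Seal:
    """Execution layer: the agent only proposes, this layer decides and runs."""

    def __init__(self, policy, handlers):
        self.policy = policy
        self.handlers = handlers
        self.audit_log = []

    def execute(self, action, target):
        if not self.policy.permits(action, target):
            self.audit_log.append(("denied", action, target))
            raise PermissionError(f"{action} on {target} is not permitted")
        self.audit_log.append(("allowed", action, target))
        return self.handlers[action](target)


# Fake handlers for illustration; a real layer would perform the action here.
handlers = {
    "read_file": lambda path: f"contents of {path}",
    "delete_file": lambda path: f"deleted {path}",
}
policy = Policy(allowed_actions={"read_file"}, allowed_prefixes=("/workspace/",))
seal = Seal(policy, handlers)

result = seal.execute("read_file", "/workspace/notes.txt")  # planned by the llm, allowed
try:
    seal.execute("delete_file", "/workspace/notes.txt")     # planned by the llm, blocked
    blocked = False
except PermissionError:
    blocked = True
```

file access in a coding agent seems like a reasonable first place to try it, since the allowed set is easy to state.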


> all chip manufacturers would just collude to quit selling to consumers?

Well, someone with money could go buy 100% of RAM production for the next 3 years.


This is all likely true. Although I feel people undersell how they work together.

Iranians broadly hate their government, yeah. But the thing that gets them rioting is economic failure. Which the strikes have exacerbated.

Social media is swarmed by people saying it’ll be like Iraq and Iranians will hate the US for its actions. I’m not convinced. My small anecdata of Iranian friends with contacts in Iran agrees with me.

I think we could see regime change within a decade.


And while we're at it, stop with the popups and notifications.

I don't care about the new features in a browser update. Ideally, nothing at all has changed.

I don't want a "tour" of the software I just installed. I, presumably, installed it to do something, and I just want to do that thing.

I don't want to have to select a preference for how a specific action is performed in your software. If it's not what I expected, I will learn it.

And for the love of GOD, nobody wants to subscribe to your newsletter.


Wait, this is news to me. I thought 3rd party use of the sub was unequivocally prohibited?

If I'm understanding you correctly: they changed that policy, you can now use 3rd party software unofficially with the undocumented Claude Code endpoint, and their servers auto-detect this and charge you extra for it?


Ah yes, back when the dollar was 7 NOK and not ~10 NOK, the Big Mac meal would indeed have been the equivalent of $20.

Anthropic paved the path for agentic coding, and their pricing made it possible for masses of people to discover and experiment with this new style of development. Their Claude Code plans subsidized usage of models so much that I'm sure they must've had negative margins for quite some time. But now that they have acquired a substantial user base, it makes sense for them to dial back and become more greedy. These quiet and weird changes to the behavior of Claude in recent weeks must have been due to both this increased greed and their struggles with scaling.

What I wish for right now is for open-weight models and hardware companies (looking at you Apple) to make it possible to run local models with Opus 4.6-level intelligence.


(In fact, I may well be nearer to your position than my description implies. I use the term "leftist" because I hate the way the term is applied to anyone who isn't a Republican. My beliefs, in the Clinton/Obama range, are "leftist" only if one is dumb enough to believe what one hears on Fox News.)

Since the majority shareholder(s) can decide to replace the board of directors, it’s not the board of directors who holds the (ultimate) power, it’s the majority shareholder(s).

> "There are zero reasons to limit yourself to 1GB of RAM"

> Immediately proposes alternative which is literally 4x the cost.


Obviously this is up to courts and juries to hammer out but...

- Your agentic loop hacked something? You're liable.
- FSD crashes? The guy in the driver's seat is liable. He/his insurance can sue Tesla to spread the liability...

Nowhere along the line will anyone go "Oh, the AI did it... whoops"

