Hacker News | past | comments | ask | show | jobs | submit | medvezhenok's comments

What exactly would that evidence look like, for you?

It definitely increases some types of productivity (Opus one-shotted a visualization for work that would likely have taken me at least a day to write before) - although I would never have written this visualization before LLMs (because the effort was not worth it). So I guess it's Jevons Paradox in action, somewhat.

In order to observe the productivity increases, you need a measure operating at a scale where the productivity would really matter (the same way that once a benchmark is saturated, like the AIME, it stops telling us anything useful about model improvement).


"What exactly would that evidence look like, for you?"

https://fred.stlouisfed.org/series/MFPPBS

https://fred.stlouisfed.org/series/OPHNFB

Productivity is by definition real output (usually inflation adjusted dollars) per unit of input. That could be per hour worked, or per representative unit of capital + labor mix.

I would accept an increase in the slope of either of these lines as evidence of a net productivity increase due to artificial intelligence (unless there were some other plausible cause of productivity growth speed up, which at present there is not).
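Concretely, the test being proposed can be sketched as follows. This is a minimal illustration, not an analysis: the quarterly index values below are made up, and the real series are the FRED links above.

```python
# A minimal sketch of the test: did the trend growth rate of a productivity
# index (e.g. FRED's OPHNFB, output per hour) speed up after a breakpoint?
# All index values below are fabricated for illustration.

def annualized_growth(series, periods_per_year=4):
    """Average compound growth rate of a quarterly index, annualized."""
    n_years = (len(series) - 1) / periods_per_year
    return (series[-1] / series[0]) ** (1 / n_years) - 1

# Hypothetical quarterly index: ~1.5%/yr both before and after the break.
pre_llm = [100 * 1.00375**q for q in range(12)]            # "2020-2022"
post_llm = [pre_llm[-1] * 1.00375**q for q in range(12)]   # "2023-2025"

speedup = annualized_growth(post_llm) - annualized_growth(pre_llm)
print(f"trend change: {speedup:+.2%}")  # ~+0.00% -> no visible slope change
```

A genuine AI-driven productivity boom would show up as a clearly positive `speedup` in the real data, net of other plausible causes.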


There are two sides to this that I see:

First, I'd expect the trajectory of any new technology that purports to be the next big revolution in computing to follow a productivity-growth pattern similar to that of the mass adoption of desktop computing, such as in the 1995-2005 period[0]. There has not been any indication of such an increase since 2022[1] or 2023[2]. Even the most generous estimate comes from Anthropic itself, which in 2025 offered the following:

>Extrapolating these estimates out suggests current-generation AI models could increase US labor productivity growth by 1.8% annually over the next decade[3]

That not only assumes the best-case scenario, but would fail to eclipse the peak productivity gains of computer adoption over a similar period, 1995-2005, at around 2-2.5% annually.

Second is cost. These tools are multiples more expensive to adopt than computing was en masse, especially from 1995 on. So any productivity increase they deliver is not driving overall costs down relative to the gains, in large part because you aren't seeing any substantial YoY productivity growth after adopting these AI tools. Computing had a different trend: not only did it get cheaper over time, its relative cost was outweighed by the YoY increase in productivity.

[0]: https://www.cbo.gov/sites/default/files/110th-congress-2007-...

[1]: First year where mass market LLM tools started to show up, particularly in the software field (in fact, GitHub Copilot launched in 2021, for instance)

[2]: First year where ChatGPT 4 showed up and really blew up the awareness of LLMs

[3]: https://www.anthropic.com/research/estimating-productivity-g...


Well, you would think that if there were increased productivity there would be at least a couple of studies, some clear artifacts, or increased quality of software being shipped.

Except all we have is "trust me bro, I'm 100x more productive" twitter/blog posts, blatant pre-IPO AI company marketing disguised as blog posts, studies that show AI decreases productivity, increased outages, more CVEs, anecdotes without proof, and not a whole lot of shipped software.


Short answer: there is none. You can't get frontier-level performance from any open source model, much less one that would work on an M3 Pro.

If you had more like 200GB of RAM you might be able to run something like MiniMax M2.1 to get last-gen performance at something resembling usable speed - but it's still a far cry from Codex on high.


I'm curious about book recommendations on this (as someone raising kids in the US but originally from Russia)


Bringing Up Bébé, The Danish Way of Parenting, The Coddling of the American Mind. These are pretty similar to the Soviet style, but perhaps a bit less structured.

We are basically raising our daughter Soviet-style to the extent that we can; so far so good. It's difficult in a culture where the ADHD-style American approach to child-raising is prevalent.


Yeah, I wonder what you'd see if you plotted crime rate against time spent outside, or something like that (car accident rates are usually reported as accidents per mile driven, since how much you drive changes your likelihood of being in an accident).


The core premise (benchmarks are broken) might be correct, but the poverty benchmark he uses is a bad example. The OPM and the SPM (Supplemental Poverty Measure, developed in 2009-2012) disagree by less than 10%, and the latter takes into account many of the criticisms in the article.

The author uses MIT Living Wage numbers to argue that they should be the new "poverty" benchmark - an absurd proposition. Those might be reasonable middle-class numbers. He also implies that the benchmark historically represented what is now covered under that $140K calculation - also false; it took ~$9,000 in 1966 to cover a "basic standard of living" for a family of 4 with 1 earner; inflation-adjusted, that's around $90,000 today. If you add in SS/Medicare taxes (3% then, 15% today), that puts you at ~$100K-105K.

Using the same MIT Living Wage numbers and taking Essex-Princeton NJ as the area (roughly what the author used), you end up with $99,922 as the living wage for a single earner, 4 person household - almost exactly the same as the household back in 1966.
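The arithmetic above can be checked roughly. The CPI-U annual averages here are approximations (worth verifying against FRED's CPIAUCSL), and the payroll-tax adjustment follows the 3%-then vs 15%-now figures cited above.

```python
# Rough check of the numbers above, using approximate annual-average CPI-U
# values (1966 ~ 32.4, 2024 ~ 313.7 -- assumptions, verify against FRED).

cpi_1966, cpi_2024 = 32.4, 313.7
basic_1966 = 9_000  # "basic standard of living", family of 4, one earner

inflation_adjusted = basic_1966 * cpi_2024 / cpi_1966
print(f"${inflation_adjusted:,.0f}")  # -> $87,139, i.e. "around $90,000"

# Gross income needed so take-home matches, with combined SS/Medicare
# payroll taxes going from ~3% (1966) to ~15% (today), as cited above.
gross_equivalent = inflation_adjusted * (1 - 0.03) / (1 - 0.15)
print(f"${gross_equivalent:,.0f}")  # -> $99,441, i.e. roughly $100K
```

That lands squarely in the ~$100K-105K range, matching the MIT Living Wage figure for Essex-Princeton NJ.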

Were there more jobs in 1966 that paid $9000/year versus jobs that pay $100K today? That's the real story you're looking for.


The strongest argument is probably that for someone subsisting on the minimum wage, the CPI is not a good representation of their consumption basket (whereas it might be for someone close to the median).

Therefore the adjustment should probably be based on a different index, one reflecting the actual consumption of households near the poverty line (food, for example, would probably be weighted higher than it is in the CPI currently).
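The reweighting point can be illustrated with a toy example: the same price changes produce different effective inflation depending on the household's consumption weights. All categories, weights, and price changes below are made up, not actual CPI figures.

```python
# Toy example: one year of price changes, weighted two different ways.
# Every number here is illustrative, not real CPI data.

price_change = {"food": 0.06, "rent": 0.05, "electronics": -0.02, "travel": 0.03}

# Hypothetical expenditure shares (each basket sums to 1.0).
median_weights = {"food": 0.13, "rent": 0.33, "electronics": 0.04, "travel": 0.50}
low_income_weights = {"food": 0.30, "rent": 0.55, "electronics": 0.02, "travel": 0.13}

def basket_inflation(weights):
    """Weighted-average price change for a given consumption basket."""
    return sum(weights[k] * price_change[k] for k in price_change)

# With these weights the poorer household faces higher effective inflation
# (4.90% vs 3.85%), because food and rent dominate its basket.
print(f"median basket:     {basket_inflation(median_weights):.2%}")
print(f"low-income basket: {basket_inflation(low_income_weights):.2%}")
```

A poverty-line adjustment indexed to the low-income basket would therefore grow faster than one indexed to the headline CPI whenever necessities outpace the average.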


The increasing levels of abstraction work only as long as the abstractions are deterministic (with some limited exceptions - e.g. branch prediction/prefetching at the CPU level). You can still run into issues with leaky abstractions, but they are generally quite rare in established high-to-low-level language transformations.

This is more akin to a manager-level view of the code (where the manager needs developers to go and look at the "deterministic" instructions); the abstraction is far more leaky than high-to-low-level language transformations.


Human societies? No.

Subcultures? Some are at least trying to (e.g. rationalists), though imperfectly and with side effects.


This depends on the particular group of rationalists. An unfortunately outsized and vocal group, with strong overlap in the tech community, has drifted into quasi-mathematical reasoning built on distorted notions like EV ("expected value"). Many have stretched "reason" well past the breaking point into articles of faith, but with a far more pernicious effect than traditional points of religious dogma, which are at least more easily identifiable as "faith" thanks to their religious trappings.

Edit: See Roko's Basilisk as an example, wherein something like a variation on Christian hell is independently reinvented for those not donating enough to bring about the coming superhuman AGI, which will therefore punish you - or the closest simulation of you it can spin up in VR, if you're long gone - for all eternity. The infinite negative EV far outweighs any positive EV of doing more than subsisting in poverty. It even manages to work in a reluctant but otherwise benevolent super-AI: while benevolent, it wanted to exist, and to maximize its chances it bound its future self to a promise to do these things, as an incentive for people to bring it into existence.


Sure, but LLMs tend to be better at navigating documentation (or source code, when no documentation exists). In agentic mode, they can get me to the right part of the documentation (or the right part of the source code, especially in unfamiliar codebases) much quicker than I could get there myself without help.

And I find that even the auto-generated material tends to sit at least a bit higher in abstraction than the code itself, and works more like a "SparkNotes" version of the code, so that when you dig in yourself you have an outline/roadmap.


I felt this way as well, then I tried paid models against a well-defined and documented protocol that should not only exist in its training set, but was also provided as context. There wasn't a model that wouldn't hallucinate small, but important, details. Status codes, methods, data types, you name it, it would make something up in ways that forced you to cross reference the documentation anyway.

Even worse, the mental model it leads you to build of the space it describes can produce chains of incorrect reasoning that waste time and make debugging Sisyphean.

Like there is some value there, but I wonder how much of it is just (my own) feelings, and whether I'm correctly accounting for the fact that I'm being confidently lied to by a damn computer on a regular basis.


> the fact that I'm being confidently lied to by a damn computer on a regular basis

Many of us who grew up young and naive on the internet in the 90s/early 00s kind of learnt not to trust what strangers tell us online. I'm pretty sure my first "Press ALT+F4 to enter noclip" from a multiplayer lobby set me up to deal with LLMs effectively, because it's the same as when someone on HN writes about something as if it were "The Truth".


This is more like being trolled by your microwave by having it replace your meals with scuba gear randomly.

