Hacker News | sweezyjeezy's comments

The model should show some facsimile of understanding that it should not ignore the stop hook, otherwise that is a regression. Does that wording make you happier?

They said it doesn’t “understand” anything with which to give a real answer, so there’s no point in asking. You said “yeah but it should at least emulate the words of something that understands, that way I can pay a nickel for some apology tokens.” That about right?

I mean at some point what difference does this make? We can split hairs about whether it 'really understands' the thing, and maybe that's an interesting side-topic to talk about on these forums, but the behavior and outputs of the model are what really matter to everyone else, right?

Maybe it doesn't 'understand' in the experiential, qualia way that a human does. Sure. But it's still a valid and useful simile to use with these models because they emulate something close enough to understanding; so much so now that when they stop doing it, that's the point of conversation, not the other way around.


When people talk about an LLM “not understanding” you’re apparently taking it to be similar to someone saying a fish doesn’t “understand” the concept of captivity, or a dog doesn’t “understand” playing fetch. Like the person is somehow narrowly defining it based on their own belief system and, like, dude, what is consciousness anyway?

That’s not what’s happening. When it’s said that an LLM doesn’t understand it’s meant in the “calculator doesn’t understand taxes” or “pachinko machine doesn’t understand probability” way. The conversation itself is silly.


Deep learning works at a very high level because 'it can keep learning from more data' better than any other approach. But without the 'stupid amount of data' that is available now, the architecture would be kind of irrelevant. Unless you go some way toward explaining both sides of the model-data equation, I don't feel you have a solid basis to build a scientific theory, e.g. 'why reasoning models can reason'. The model is the product of both the architecture and the training data.

My fear is that this is as hopeless right now as explaining why humans or other animals can learn certain things from their huge amount of input data. We'll gain better empirical understanding, but it won't ever be fundamental computer science again, because the giga-datasets are the fundamental complexity, not the architecture.


Ironic that you are talking about unrepresentative samples while characterizing Reddit users that way. Reddit is a huge cross-section of the internet, with over a billion monthly active users. It covers basically every stratum of western society you can think of.

They seem to count unregistered users in the D/W/MAU figures, but even so, the producer-to-consumer ratio of content across social media still stands. What you read comes from a subsection of all visitors to Reddit, one that follows a power law.

Are you counting the fakes and the bots?

Moreover, as sibling pointed out, content creator != visitor.

If Reddit was a representative sample of reality, all governments in the West would have been run by gay race Communists.


I have to ask - are you a bot? You seem to be making some very strong and misinformed statements here. You can fixate on the number if you like, but calling Reddit homogeneously radical left or 'very small', is pretty stupid.

Thinking that Reddit has any resemblance to reality is what is stupid.

That's a false-dichotomy. Capitalism was good for artisanal workers before the industrial revolution, and then it became pretty goddamn bad for them. We're worried we're staring down the barrel of that right now - just saying 'well it was even worse before capitalism' does nothing for us.

Yes it does: it says that trying to prevent technology in order to protect the interests of some special class of people at the expense of everyone else is dumb and shortsighted.

If people had actually listened to the people wailing "but what about the horse carriage business!!!" in the 20th century, it would have been a disaster.


Sure, but AI pessimism is allowed to be personal. Am I supposed to be optimistic when I feel I'm about to get shafted? Should I be less concerned that I need to provide for my family, because in the long term this is going to be a great step forward for humanity?

You're allowed to be personally pessimistic, but if you actually do anything to prevent it from happening, I think that is incredibly selfish.

It would be like an oncologist trying to stop an anti-cancer pill because they'd be out of a job.


You are addressing something totally different from the original claim, which tried to say that capitalism is inherently exploitative of labour, which is just outdated Marxism.

To be frank, I thought trying to twist this into an argument about whether capitalism is inherently exploitative was a complete waste of time and I replied as such. If you'll recall what we were originally talking about here - "AI, should HN users be optimistic?"

That's a good idea and FWIW I agree that as a person who might lose their job to AI, you do deserve to feel apprehensive, even if it might lead to some good later.

Well, this is HN, so a lot of us are pretty terrified of your 1). We went from 'you have a good job for the next couple of decades' to 'your job is at extreme risk of disruption from AI' in the space of about 5 years. Personally I have a family, I'm a bit old to retrain, and I never worked at a high-comp FAANG or anything, so I can't just focus on painting unless my government helps me (note: not US/China). That's extremely anxiety-inducing, and a vague promise of novel new things does not come close to compensating.

I'm 33 and I feel sort of lucky that I'll still potentially have time to retrain. I'm fully prepared for the likelihood that within the next 5 years or so (and potentially much less) I'll need to retrain into a trade or something to stay relevant in any sort of field.

Many people claim it's going to become a tool we use alongside our daily work, but it's clear to me that's not how anybody managing a company sees it, and even the AI labs that previously emphasized how much it would augment existing workforces are now pushing doing more with less.

Most companies are holding onto their workforce only begrudgingly while the tools advance and they still need humans for "something", not because they're doing us some sort of favor.

The way I see it unless you have specialized knowledge, you are at risk of replacement within the next few years.


> I'm 33 and I feel sort of lucky that I'll still potentially have time to retrain. I'm fully prepared for the likelihood that within the next 5 years or so (and potentially much less) I'll need to retrain into a trade or something to stay relevant in any sort of field.

The problem is that not many fields are going to be immune to AI-based cost cutting, and there surely will not be enough work for all of us even if we all retrain.

If we all do, it will create an absolutely massive downward pressure on wages, due to massive oversupply in other lines of work too.

So there's really just no good way out


I also have contemplated just retraining now to try and get ahead of the curve, but I'm not confident that trades can absorb the shock of this - both in terms of supply (more unemployment) and demand (anything non-commercial will be hit by capital flight on the customer-side). I figure I will just try and make as much money on a higher wage as I can and hope for the best...

> But the entire value is that it can be automated. If you try to automate a small model to look for vulnerabilities over 10,000 files, it's going to say there are 9,500 vulns. Or none.

'Or none' is ruled out, since it found the same vulnerability. I agree there is a question about precision on the smaller model, but barring further analysis, '9500' feels like pure vibes on your part. Also (out of interest), did Anthropic post their false-positive rate?

The smaller model is clearly the more automatable one IMO if it has comparable precision, since it's just so much cheaper - you could even run it multiple times for consensus.
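A minimal sketch of that consensus idea: run the cheap scanner several times and keep only the findings that a majority of runs agree on. The `scan` callable and the `crypto.c` path here are hypothetical stand-ins for whatever small-model scanner you'd actually use.

```python
from collections import Counter

def consensus_findings(scan, path, runs=5, quorum=3):
    """Run a scanner several times and keep only the findings
    reported by at least `quorum` of the runs."""
    votes = Counter()
    for _ in range(runs):
        votes.update(set(scan(path)))  # de-dupe within a single run
    return sorted(f for f, n in votes.items() if n >= quorum)

# Stub standing in for a noisy small-model scan: line 42 is flagged
# in every run, line 7 only once (a spurious finding that gets voted out).
outputs = iter([[42, 7], [42], [42], [42, 42], [42]])
findings = consensus_findings(lambda path: next(outputs), "crypto.c")
# findings == [42]
```

The same trick applies to any high-recall, low-precision checker: repeated sampling plus a vote trades extra (cheap) compute for precision.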


Admittedly just vibes from me, having pointed small models at code and asked them questions, no extensive evaluation process or anything. For instance, I recall models thinking that every single use of `eval` in javascript is a security vulnerability, even something obviously benign like `eval("1 + 1")`. But then I'm only posting comments on HN, I'm not the one writing an authoritative thinkpiece saying Mythos actually isn't a big deal :-)


My proof-in-pudding test is still the fact that we haven't seen gigantic mass firings at tech companies, nor a massive acceleration on quality or breadth (not quantity!) of development.

Microsoft has been going heavy on AI for over a year now. But then they replace their cruddy native Windows Copilot application with an Electron one. If tests and dev only have marginal cost now, why aren't they going all in on writing extremely performant, almost completely bug-free native applications everywhere?

And this repeats itself across all big tech and AI hype companies. They all have these supposedly earth-shattering gains in productivity, but then... there hasn't been anything to show for it in years? Despite that whole subset of tech, plus big tech, dropping trillions of dollars on it?

And then there is also the really uncomfortable question for all tech CEOs and managers: LLMs are better at 'fuzzy' things like writing specs or documentation than they are at writing code. And LLMs are supposedly godlike. Leadership is a fuzzy thing. At some point the chickens will come home to roost, and tech companies with LLM CEOs / managers and human developers, or even completely LLM-run companies, will outperform human-led / managed companies. The capital class will jeer about that for a while, but the cost of tokens will continue to drop to near zero. At that point, they're out of leverage too.


Leadership is also a very human thing. I think most people would balk at the idea of being led by an LLM.

One of the main functions of leaders is (or should be) to assume responsibility for decisions and outcomes. A computer can't do that.

And finally, why should someone in power choose to replace themselves?


>One of the main functions of leaders is (or should be) to assume responsibility for decisions and outcomes. A computer can't do that.

Sure it can. "Assuming responsibility" just means people/the law let you.

It can be totally empty too, like CEOs or politicians "assuming responsibility" for some outcome but nevertheless suffering zero consequences.


Someone in power doesn’t get to choose - the board of directors does, whose job is to act in the best interest of shareholders.

Firms tend to follow peers in an industry - once one blinks the rest follow.


The board of directors are also people in power - why not replace them with an LLM as well if it works so well for CEOs?


> Someone in power doesn’t get to choose - the board of directors does, whose job is to act in the best interest of shareholders.

Alas, shareholder value is a great ideal, but it tends to be honoured in practice rather less strictly.

As you can also see when sudden competition leads to rounds of efficiency improvements, cost cutting and product enhancements: even without competition, a penny saved is a penny earned for shareholders. But only when fierce competition threatens to put managers' jobs at risk, do they really kick into overdrive.


>shareholder value is a great ideal

It's one of the most horrible ideas ever, responsible for anything from market abuse and enshittification to rent seeking and patent trolling.


> Someone in power doesn’t get to choose - the board of directors do

Since the board of directors can decide to replace the CEO, it's not the CEO who holds the (ultimate) power, it's the board of directors.


Since the majority shareholder(s) can decide to replace the board of directors, it’s not the board of directors who holds the (ultimate) power, it’s the majority shareholder(s).


Indeed, and there we reached the end of the chain.

Your proof-in-pudding test seems to assume that AI is binary -- either it accelerates everyone's development 100x ("let's rewrite every app into bug-free native applications") or nothing ("there hasn't been anything to show for that in years"). I posit reality is somewhere in between the two.


Considering that we were promised "AI will replace nearly all devs", "AI will give a 100x boost", and the like, it makes sense to question this.

After all, almost all hyped technology lands "somewhere between the two" extremes of doing everything it promises and doing nothing at all. The question is which edge it's closer to.


LLMs are capable of searching information spaces and generating outputs that one can use to do their job.

But it’s not taking anyone’s job, ever. People are not bots; a lot of the work they do is tacit and goes well beyond the capabilities of LLMs.

Many tech firms are essentially mature and are currently using too much labour. This will lead to a natural cycle of layoffs if they cannot figure out projects to allocate the surplus labour to. This is normal and healthy - only a deluded economist believes in ‘perfect’ stuff.


"it’s not taking anyone’s job, ever"

It already has, and that doesn't mean new jobs haven't been created, or that those new jobs went to those who lost theirs.


In this entire thread of conversation, I never said that LLMs would take people's jobs, and that is not something I believe.


> LLMs are better at 'fuzzy' things like writing specs or documentation than they are at writing code.

At least for writing specs, this is clearly not true. I am a startup founder/engineer who has written a lot of code, but I've written less and less code over the last couple of years and very little now. Even much of the code review can be delegated to frontier models now (if you know which ones to use for which purpose).

I still need to guide the models to write and revise specs a great deal. Current frontier LLMs are great at verifiable things (quite obvious to those who know how they're trained), including finding most bugs. They are still much less competent than expert humans at understanding many 'softer' aspects of business and user requirements.


> Microsoft has been going heavy on AI for 1y+ now. But then they replace their cruddy native Windows Copilot application with an Electron one.

This.

Also, Microsoft is going heavy on AI, but it's primarily chatbot gimmicks they call Copilot agents, and they need to deeply integrate them with all their business products, and have customers grant access to all their communications and business data, to give the chatbot something to work with. They go on and on in their AI pitch with their example of how a company can run on agents alone, and they tell everyone their job is obsoleted by agents, but they don't seem to dogfood any of their products.


> My proof-in-pudding test is still the fact that we haven't seen gigantic mass firings at tech companies

This assumes that companies will announce such mass firings (yes, I'm aware of the WARN Act), when in reality they will steadily let people go for various reasons (including "performance").

From my (tech heavy) social circle, I have noticed an uptick in the number of people suddenly becoming unemployed.


> My proof-in-pudding test is still the fact that we haven't seen gigantic mass firings at tech companies

Jevons paradox.


For Jevons paradox to be a win-win, you need these 3 statements to be true:

1) Workers get more productive thanks to AI.

2) Higher worker productivity translates into lower prices.

3) Most importantly, consumer demand needs to explode in reaction to lower prices. And we're finding out in real time that the demand is inelastic.

Around 1900, 40% of American workers worked in agriculture. Today, it's < 2%.

Which is similar to what we see with coding: the increase in demand has not exploded enough to offset the job-killing effect of each farmer being able to produce more food.


What's a situation where one needs to use `eval` in a benign way in JS? If something is precomputable (e.g. `eval("1 + 1")` can just be replaced by 2), then it should be precomputed. If it's not precomputable, then it's dependent on input and thus hardly benign - you'll need to carefully verify that the inputs are properly sanitized.


With LLMs (and colleagues) it might be a legitimate problem since they would load that eval into context and maybe decide it’s an acceptable paradigm in your codebase.


I remember a study from a while back that found something like "50% of 2nd graders think that french fries are made out of meat instead of potatoes. Methodology: we asked kids if french fries were meat or potatoes."

Everyone was going around acting like this meant 50% of 2nd graders were stupid with terrible parents. (Or, conversely, that 50% of 2nd graders were geniuses for "knowing" it was potatoes at all)

But I think that was the wrong conclusion.

The right conclusion was that all the kids guessed and they had a 50% chance of getting it right.

And I think there is probably an element of this going on with the small models vs big models dichotomy.


I think it also points to the problem of implicit assumptions. Fish is meat, right? Except for historical reasons, the grocery store's marketing says "Fish & Meat."

And then there's nut meats. Coconut meat. All the kinds of meat from before meat meant the stuff in animals. The meat of the problem. Meat and potatoes issues.

If you asked that question before I'd picked up those implicit assumptions, or if I never did, I would have to guess.


I’ve got many Catholic relatives who describe themselves as vegetarians and eat fish. Language can be surprisingly imprecise and dependent on tons of assumptions.


> I’ve got many Catholic relatives who describe themselves as vegetarians and eat fish

Those are pescatarians.

It's like how a tomato is a fruit but is used as a vegetable: meat has traditionally been the flesh of warm-blooded animals, and fish is the flesh of cold-blooded animals, making it meat - but for religious reasons it's not considered meat.


Right exactly. The point is that dictionary definitions don’t always align with cultural ones.


> 'Or none' is ruled out since it found the same vulnerability

It's not, though. It wasn't asked to find vulnerabilities over 10,000 files - it was asked to find a vulnerability in the one particular place in which the researchers knew there was a vulnerability. That's not proof that it would have found the vulnerability if it had been given a much larger surface area to search.


I don't think the LLM was asked to check 10,000 files given these models' context windows. I suspect they went file by file too.

That's kind of the point - I think there are three scenarios here:

a) this is just the first time an LLM has done such a thorough minesweeping

b) previous versions of Claude did not detect this bug (seems the least likely)

c) Anthropic have done this several times, but the false positive rate was so high that they never checked it properly

Between a) and c) I don't have a high confidence either way to be honest.


Mythos was also asked to find a vulnerability in one file, in turn for each file. Maybe the small model needs to be asked about each function instead of each file. Okay, you can still automate that.

Or run multiple cheap models in parallel: MoE^n, in effect.
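The "ask about each function instead of each file" idea can be sketched with Python's standard `ast` module (shown here for Python source, purely as an illustration): split the file into per-function chunks, then send each chunk to the model as its own prompt.

```python
import ast

def function_sources(source: str):
    """Yield (name, code) for every function definition in a Python
    file, so each one can be sent to the model as its own prompt
    instead of the whole file at once."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            yield node.name, ast.get_source_segment(source, node)

code = (
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "def danger(s):\n"
    "    return eval(s)\n"
)
chunks = dict(function_sources(code))
# chunks has keys "add" and "danger", each mapped to that function's source
```

Smaller prompts like this also sidestep context-window limits, at the cost of losing cross-function context the model might need.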

I think it's completely normal. Whenever automation comes knocking, people are inclined to think it's going to flatline conveniently before their job is at risk. LLMs can code now? Cool, they can't code well though can they? Oh they can code pretty well now? Cool, coding was never the hard part of SWE anyway, it's [thing we have no reason to think AI can't beat 99% of humans at at some point], etc

I think SWE as a mainstream profession is much nearer to the end than the beginning, I'm curious and quite scared about what becomes of us.


The problem is that software development contains domain independent and domain specific skills. Since information processing is domain independent, replacing software developers in general will require beating them not only in the domain independent skills, which is what the recent breakthroughs have been about, but also in every single domain dependent skill.

This makes software development AGI-complete. If you have an LLM that can write software for every domain, then for every task you assign it, it could build software that performs the assigned task and thereby solves every problem in existence.

What I'm trying to get at here is that an "SWE" is a biological machine building machine. If you have a digital machine that can build any machine, you haven't solved the first step, you've solved the final step in all of human history that ever needs to be done, whatever that means. Beyond that point, human work no longer exists, because the machines have taken over everything.


I don't think you understand. Frankly, AI is a failure if all it does is replace coders. AI needs (given its current investment levels) to conquer all forms of knowledge work. This is an example of tech/industry needing to impose itself on society, rather than society needing it.


That's how human progress works. No one can want or need it because they cannot conceptualize wanting it until someone shows that it is possible. Now, many of those wants become needs.


We can absolutely conceptualize what we want or need. I was born in 1980 in NYC. When I was a boy my father took me to a tech conference where they had a demo of ordering TV shows on demand. It was a miracle, to my young mind. Was this what I needed?

Growing up I had a friend group of misfit boys, who discovered h4ck1ng and phr34king. But we also discovered slackware Linux on 3.5" floppies. We also had to discover ASM and compiling the linux kernel in order to do anything with it. Boys with machines. That wasn't what I needed either.

Later on we did have great things with tech. Google made the world searchable in ways AltaVista didn't. I remember strapping the original iPod on my arm to go for runs outside. I didn't even need a car for a while; investors subsidized my Uber rides to and from the office.

Now, it seems the US is balanced on a precipice. The economy seems to have an incredible amount of money desperate to grow, but to what purpose? In my lifetime, and in my parents' and their parents' before them, when the dollar becomes restless, the flag goes forth. The dollar follows the flag.

And here we are at war.


You wouldn't have known about that TV demo had you not seen it. That is what I mean: people generally can't conceptualize what they want or need until they see it.


Wants and needs are not the same. We are experiencing the difference in real time. AI does not give society a want or need.


My point was not about the difference; it was that average people cannot conceptualize new ideas until one person or team invents them - then the average person will want or need them.

As for AI, I and many others want it, and some even need it, in certain use cases. Speak for yourself.


I believe the idea that you (or I) might know better than the 'average people' to be incredibly conceited, arrogant, and frankly wrong. It is an attitude that gives you superiority for having achieved nothing.


I'm not sure what you're even talking about, you're putting words and an argument into my mouth which I never said.


Well then I owe you an apology. Perhaps I inferred too much about your point of view and understood too little, which is my own loss. Sorry.


I think your numbers are off. TAM for office workers is ~20T a year, of which SWE compensation is ~3T. So if they can make 3T x 10% x 5 years = 1.5T, that covers their current valuations. It's not as insane as you make out, even without taking into account other high-risk areas like legal, accounting, etc.


Hit the nail on the head with that framing. So many articles are now coming out addressing the anxieties about adoption of a new technology, but we genuinely don’t really need it as a society.

I still wonder if we really needed the iPhone, or many of the other things we're told are "progress" and innovation, as if on an inevitable arrow of time. The future is not set in stone and things need not play out this way at all. Unlike the iPhone, where most were excited by its possibilities (even if they traded precious privacy for convenience), there's no clear reason to think this version of LLM-driven technology represents significantly more upside than downside.


I think this was an unpopular opinion mainly because it's scary, rather than there being an obvious reason to think otherwise.


The pandas API is awful, but why is kind of interesting. It started as a financial time-series manipulation library ('panel data') at a hedge fund, and a lot of the quirks come from that. For example the unique obsession with the 'index': functions seemingly randomly returning dataframes with column data as the index, or having to write index=False every single time you write to disk, or the index being appended to a Series' numpy data, leading to incredibly confusing bugs. That all comes from the assumption that there is almost always a meaningful index (timestamps).


> The pandas API is awful

I hate to be the "you're holding it wrong" guy but 90% of "Pandas bad!" posts I find are either outright misinformed or mischaracterizing one person's particular opinion as some kind of common truth. This one is both!

> That comes from the assumption that there is almost always a meaningful index (timestamps)

The index can be literally any unique row label or ID. It's idiosyncratic among "data frames" (SQL has no equivalent concept, and the R community has disowned theirs), but it's really not such a crazy thing to have row labels built into your data table. Excel supports this in several different ways (frozen columns, VLOOKUP) and users expect it in just about any table-oriented GUI tool.

> having to write index=False every single time you write to disk

If you're actually using the index as it's meant to be used, you'd see why this isn't the default setting.

> functions seemingly randomly returning dataframes with column data as the index

I assume you're talking about the behavior of .groupby() and .rolling()? It's never been random. Under-documented, with hard-to-reason-about group_keys= and related options, yes. But not random.

> appending the index to the Series numpy data leading to incredibly confusing bugs

I've been using Pandas professionally almost daily since 2015 and I have no idea what this means.


I think the commenter you are replying to might well understand these nuances. The point is not that Pandas is inscrutable, but that it's annoying to use in many common use-cases.


> but it's really not such a crazy thing to have row labels built into your data table.

Sometimes you need data in a certain order. Sometimes there is no primary key. And it is nuts how janky the pandas API is if you just want the index to mean the current order of the dataframe and nothing else. Oh you did a pivot? I'm just going to make those pivot columns a row label now if that's alright with you. I don't do that for all functions though, you're going to have to remember which ones. Oh you want to sort a dataframe? You better make damn sure you reindex if you're planning to use that with data from another dataframe (e.g. x + y on data from separate dataframes), otherwise I'm going to align the data on indices, and you can't stop me. Also - want to call pyplot.plot(df['column'])? Yeah I'm giving it the data in index order obviously I don't care about that sort you just did. Oh you want to port this data to excel? Well if your row labels aren't meaningful and you don't want "Unnamed: 0" you're going to have to tell me not to. You need to manipulate a multi-index? You're so cute. Have fun with that buddy.

There is a reason no other dataframe library does this - because it's confusing cognitive overhead that doesn't need to exist. I've used pandas since ~2013, had this chat with colleagues, and many recommend just giving in and maintaining an index throughout. Except I've read their pandas code and it sucks, because now _you_ need to reason about what the index currently is - because it actually needs to change a lot to do normal things with data. I just use .reset_index copiously and try to make it behave like a normal dataframe library, because it's just easier to understand later. Pandas has not earned the right to redefine what a dataframe means.

At the absolute least, index behaviour should be opt-in, not something imposed on the user.
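The sort-then-combine gotcha described above fits in a few lines (a sketch assuming pandas is installed; the column name is made up). pandas aligns arithmetic on the index, so combining a sorted Series with an unsorted one silently undoes the sort:

```python
import pandas as pd

x = pd.DataFrame({"v": [1, 2, 3]})
y = x.sort_values("v", ascending=False)  # y's index is now [2, 1, 0]

# Arithmetic aligns on the index, not on position, so each row is
# added to itself rather than to the row in the same position:
aligned = (x["v"] + y["v"]).to_list()
# aligned == [2, 4, 6]

# reset_index(drop=True) restores the positional behaviour you likely expected:
positional = (x["v"] + y["v"].reset_index(drop=True)).to_list()
# positional == [4, 4, 4]
```

Nothing here raises a warning; the alignment happens silently, which is exactly the "incredibly confusing bugs" complaint.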


> After careful consideration of Oracle’s current business needs, we have made the decision to eliminate your role as part of a broader organizational change.

That is being laid off, not being fired - big difference. Being fired means being let go for poor performance / bad behaviour, and no severance or grace period is necessary there (it will be written in the contract). Being made redundant, particularly a redundancy of this size, is quite well protected in the EU. Typically negotiations between HR and representatives of the laid-off group are required, and you will continue to work (officially at least) until negotiations are over, as you are not officially out yet. This usually takes a few weeks.

I can tell you this from personal experience...

