Hacker Newsnew | past | comments | ask | show | jobs | submit | andybak's commentslogin

What is the aim of "moral instruction" if not deterrence? Surely it needs be instruction in pursuit of an outcome?

It makes honest people feel rewarded, valued and acknowledge. It teaches people who wish to follow the rules and conform to social norms what those norms are and where we actually draw the line in practice.

https://en.wikipedia.org/wiki/Punishment#Education_and_denun...


Looked at slightly differently, given a split between high trust and low trust preventing conversions from high to low is similarly important to inducing conversions from low to high.

Even for games or experiences with no artificial locomotion whatsoever?

Yes, my understanding (and I suspect the reason why the airflow experiment worked) is that a large part of the reason this happens is because of a mismatch between the output from the vestibular and visual systems. So, the automated defenses of your body freak out and go into a defensive mode.

I think that ~30% of the population just has more sensitivity to the mismatch.


But surely that requires (virtual) player movement in VR for there to be a mismatch?

There is always going to be some movement. It’s impossible for there not to be. Whether it is rendered in the VR environment or happening in real-life through small little motions, there’s a lot of little things that help to establish the mismatch.

It’s probably most like getting car sick. You are obviously moving, but you are also stationary at the same time. This doesn’t happen to folks suffering from motion sickness when they are driving, though, because there is now a physical action tying the motion to your inputs.

This may lead you to ask why people watching a movie in a theater don’t get motion sick and the reason is the same, multiple inputs tell you otherwise. You can see the edges of the screen, you can see the audience, there’s a lot of input telling your body there’s nothing weird going on here. The more immersive, the more some people’s bodies do not handle the illusion well.


> solved

Have you considered that it's unsolvable? Or - at least - there is an irreconcilable tension between capability and safety. And people will always choose the former if given the choice.


in a pure sense no, it's probably not solvable completely. But in a practical sense, yes, I think it's solvable enough to support broad use cases of significant value.

The most unsolvable part is prompt injection. For that you need full tracking of the trust level of content the agent is exposed to and a method of linking that to what actions it has accessible to it. I actually think this needs to be fully integrated to the sandboxing solution. Once an agent is "tainted" its sandbox should inherently shrink down to the radius where risk is balanced with value. For example, my fully trusted agent might have a balance of $1000 in my AWS account, while a tainted one might have that reduced to $50.

So another aspect of sanboxing is to make the security model dynamic.


I don't know about solved, but I've seen some interesting ideas for making it safer, so I think it could be improved.

One idea is to have the coding agent write a security policy in plan mode before reading any untrusted files:

https://dystopiabreaker.xyz/fsm-prompt-injection


I am experimenting [0] with compiling markdown to a DSL first. Then running a static analysis on the DSL code. Still at an early stage though.

[0] https://deepclause.substack.com/p/static-taint-analysis-for-...


> The issue title was interpolated directly into Claude's prompt via ${{ github.event.issue.title }} without sanitisation.

How would sanitation have helped here? From my understanding Claude will "generously" attempt to understand requests in the prompt and subvert most effects of sanitisation.


I would not have helped. People are losing their mind over agents "security" when it's always the same story: You have a black box whose behavior you cannot predict (prompt injection _or not_). You need to assume worst-case behavior and guardrail around it.


I don't even think there is a sound notion of "sanitization" when it comes to LLM input from malicious actors.


You can sanitise a lab, but not a sewer.


And yet people keep not learning same lesson. It's like giving extremely gullible intern that signed no NDA admin rights to your everything and yet people keep doing it


What was the injected title? Why was Claude acting on these messages anyway? This seems to be the key part of the attack and isn’t discussed in the first article.


> Why was Claude acting on these messages anyway?

Because that's how LLMs work. The prompt template for the triage bot contained the issue title. If your issue title looks like an instruction for the bot, it cheerfully obeys that instruction because it's not possible to sanitize LLM input.


8gb is (not) enough for anyone


oh dear god. i can port this to VR now... Claude!


Why are they using JSON in the context? I thought we'd figured out that the extra syntax was a waste of tokens?


Because you might not have a build script?


Then how is anything ending up in the build directory?


Then why do you need a build directory?


qemu: mkdir build; cd build; ../configure, some projects are like that


Why can’t the configure script do this?


Because the standard for configure scripts says that the current directory at invocation is the build directory and the location of the configure script is the source directory. You are expected to be able to have multiple build directories if you want them. If you have written your configure script correctly, than in-tree builds (srcdir == builddir) also work, but most people don't want that anyway.


You can. But this makes intent clear. If you clone a git repo and see build/ with only a gitkeep, you are safe to bet your life savings on that being the compiled assets dir.


But how do you stop the boring and depressing - and abusive and manipulative parts?

I'm not saying legislation is a good solution but you seem to be making a poetic plea that benefits the abusers.


>I'm not saying legislation is a good solution but you seem to be making a poetic plea that benefits the abusers.

Only if you believe everyone else has no agency of their own. I think most people outgrow these things once they have something more interesting in their lives. Or once they're just bored.

Back when this thing was new, everyone was posting pictures of every food item they try, every place they've been to etc.. that seems to slowly change to now where there are a lot more passive consumers compared to a few polished producers.

If you're calling people delivering the content "abusers", what would you call people creating the content for the same machine?


I don't believe people have no agency.

But I do believe we overestimate our own agency. Or more importantly society is often structured on the assumption that we have more agency then we actually do.


We have agency but it is almost trivial to hijack.

Setting up the argument between agency/no agency misses the point IMO.


because some people suffer from mental health issues and need help and encouragement to break these behaviours.

And companies should not be allowed to predate on the vulnerable.


where does it stop though? I suffer from cant-stop-eating-nutella but should we shut down ferrero? it is simply not possible to protect the vulnerable in a free society. any protection only gives power into the wrong hands and will eventually get weaponized to protect “vulnerable” (e.g. our kids from learning math cause some ruling party likes their future voters dumb)


Dumb argument. They don’t intentionally make Nutella addictive and then test out recipes on the public to make it even more addictive. Other people can’t stop eating ice cream or oranges or salami.


Except that's sort of... exactly what they do.

The food industry has pretty much invented the whole process of making "addictive" products and then "test[ing] out recipes on the public to make it even more addictive". Of course, we usually call it making products that taste good, and running taste panels with the public for product development (making a new tasty thing), quality control (ensuring the tasty thing stays tasty), and market research (discovering even tastier things to make in the future). Each part of it employs all kinds of specialists (and yes, those too - nutrition psychology is a thing).

The process is the same. The difference between "optimized for taste" and "addictive" isn't exactly clear-cut, at least not until someone starts adding heroin to the product (and of the two, it's not the software industry that's been routinely accused of it just for being too good at this job).

Not defending social media here in any way. Cause and effect is known these days, and in digital everything is faster and more pronounced. And ironically, I don't even agree with GP either! I think that individuals have much less agency than GP would like it, and at the same time, that social media is not some uniquely evil and uniquely strong way to abuse people, but closer to new superstimulus we're only starting to develop social immunity to.


Nothing like reading Dumb argument followed by the dumbest sentence I've read here this month (which is ... something :) )


I would say the core problem is that we lack a goal as society. If you only care about making money stuff like this happens regardless how many regulations you do.


"Dumb" and "insane" are thoughtless and shallow positions to take.

It's fine to disagree with the EU's stance (I probably do. I'm not sure yet) but it's not a good look to dismiss it without some recognition that a reasonable person might think this is a worthwhile position to take given the known harms of social media.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: