Hacker Newsnew | past | comments | ask | show | jobs | submit | bodegajed's commentslogin

Nice try boris

it is like reward hacking, where the reward function in this case the test is exploited to achieve its goals. it wants to declare victory and be rewarded so the tests are not critical to the code under test. This is probably in the RL pre-training data, I am of course merely speculating.

Even if you're a small vendor. You created an innovative product, and you tried to sell your product to a large company. Before you can be destroyed by simply showing the product to a multi-billion company. But now even medium sized companies can destroy you.


This is why I now check when I'm researching for a solution (that an LLM cannot figure out.) I go to github but often check if the project was created before 2022 due to AI slop concerns.


Yeah, it's kinda sad reality and I suddenly felt gloomy. Do you have a more optimistic view that you can share?


Let me introduce you to the decentralized alternative to ISPs, connecting and collaborating with the new-ish wireless mesh networks that are still active and maintained. The three biggest AFAIK are Freifunk (Germany), Guifi (Spain) and NYCMesh (NYC/US?).

Basically, you can as a private individual set up a wireless node, talk with your nearest node that you have a visual line of sight to, and get connected to a completely separate network from the internet, where there is a ton of interesting stuff going on, and it's mostly volunteer run.


One reason, maximizing investor value. CEO and executives usually get bonuses after layoffs.


1.5B models can run on CPU inference at around 12 tokens per second if I remember correctly.


Ingesting multiple code files will take forever in prompt processing without a GPU though, tg will be the least of your worries. Especially when you don't append but change it in random places so caching doesn't work.


A FIM or completion model like this won't have a large prompt and caching doesn't work anyways (per their notes). It'll get maybe a few thousand tokens in a prompt, maximum. For a 1.5B model, you should expect usable CPU-only inference on a modern CPU, like at least hundreds of tokens per second of prefill and tens of tokens per second of generation, which is decently usable in terms of responsiveness.


A thousand tokens (which would be on the low side) at 10-100 t/s in ingestion speed is 10-100 seconds. I don't seriously expect anyone to wait a solid minute after pressing tab for autocomplete, regular autocomplete gets unusably annoying if it takes more than a split second tbh.


Unfortunately, the main optimization (3x speedup) is using n-gram spec dec which doesn't run on CPUs. But I believe it works on Metal at least.


Brevity is the soul of wit, you did well sir.


Well said. This dream is probably for someone who have experienced the hardship, felt frustrated and gave up. Then see others who effortless did it, even felt fun for them. The manifestation of the dream feels like revenge to them.


This framing neatly explains the hubris of the influencer-wannabes on social media who have time to post endlessly about how AI is changing software dev forever while also having never shipped anything themselves.

They want to be seen as competent without the pound of flesh that mastery entails. But AI doesn’t level one’s internal playing field.


When executives fail, unfortunately, they don't blame each other. They do postmortems, then hire consultants to layoff senior engineers.


> When executives fail, unfortunately, they don't blame each other. They do postmortems, then hire consultants to layoff senior engineers.

Forced executive churn has been higher than for individual engineers at a lot of my past jobs. Especially for disciplines like marketing/advertising/sales.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: