
LaserToy · 2025-11-09T19:19:56 1762715996

100%, and we have all of those things. Canary acts as the last line of defence, and honestly, when Canary detects and rolls back, it is already an incident that is being auto-mitigated with a limited blast radius.

To reduce the potential blast radius, we are working on a cohort-based canary, which will allow us to validate against a minimal, stable subset of traffic with the desired properties.
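
A rough sketch of what cohort selection could look like, not our actual implementation. It assumes a cohort is defined by a set of required traffic properties plus a deterministic hash bucket, so the same tenants land in the canary on every deploy. The names `in_canary_cohort`, `route`, the salt, and the property strings are all made up for illustration.

```python
import hashlib


def in_canary_cohort(tenant_id: str, cohort_percent: float, salt: str = "canary-v1") -> bool:
    """Deterministically map a tenant to one of 10,000 buckets and test membership.

    The same tenant_id and salt always give the same answer, which keeps the
    cohort stable across deploys (an assumption of this sketch).
    """
    digest = hashlib.sha256(f"{salt}:{tenant_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < cohort_percent * 100


def route(tenant_id: str, properties: set, required: set, cohort_percent: float) -> str:
    """Send traffic to the canary only if it has the desired properties
    and falls inside the stable hash cohort; everything else stays on stable."""
    eligible = required.issubset(properties)
    if eligible and in_canary_cohort(tenant_id, cohort_percent):
        return "canary"
    return "stable"
```

Keeping `cohort_percent` small limits the blast radius, and changing the salt reshuffles which tenants are in the cohort.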

LaserToy · 2025-10-29T16:49:39 1761756579

Azure portal still insists the issue is just with Console.

We had to bypass the Frontdoor.

LaserToy · 2025-10-23T16:09:33 1761235773

TLDR: A DNS automation bug removed all the IP addresses for the regional endpoints. The tooling that was supposed to help with recovery depends on the system it needed to recover. That’s a classic “we deleted prod” failure mode at AWS scale.
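
This kind of circular dependency can be checked for mechanically if you keep a dependency graph around. A minimal sketch follows; the graph and component names are hypothetical and not taken from any real system.

```python
def depends_on(graph: dict, start: str, target: str) -> bool:
    """Return True if `start` transitively depends on `target`.

    `graph` maps a component to the components it depends on directly.
    Iterative depth-first search; the `seen` set makes it safe on cyclic graphs.
    """
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep == target:
                return True
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return False


# Hypothetical example: the recovery tool reaches DNS through service discovery.
graph = {
    "recovery_tool": ["service_discovery"],
    "service_discovery": ["dns"],
    "dns": [],
}
recovery_is_circular = depends_on(graph, "recovery_tool", "dns")
```

If `recovery_is_circular` is true, the tool that is supposed to fix DNS cannot run while DNS is down.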

LaserToy · 2025-10-21T17:01:26 1761066086

The architecture of our in-house Rust-based logging engine

LaserToy · 2025-07-17T15:45:35 1752767135

Cloudkitchens use them as well: https://techblog.cloudkitchens.com/p/ml-infrastructure-doesn...

They call it a DREAM stack (Daft, Ray Engine or Ray and Poetry, Argo and Metaflow)

vibecodemaster · 2025-07-17T18:18:27 1752776307

There's actually a lot of companies using Metaflow, big and small: https://outerbounds.com/stories

LaserToy · on March 6, 2025

Same folks presented DREAM stack and some use cases at Ray Summit last year: https://www.youtube.com/watch?v=zaaKT0IyutQ and https://www.youtube.com/watch?v=McjH0WfdAyI

LaserToy · on Dec 2, 2024

I love actors as a concept and I heard some large companies (Expedia) implemented large parts using them.

But I also saw how hard it is to understand a large system that is built using actors. It is just hard to comprehend all the communication pathways and what happens in the system.

jspdown · on Dec 2, 2024

When the design closely aligns with the real-world problem it solves, communication pathways are natural and you don't really have to care much about them. What matters is the Actor's role and making sure it represents a strong domain concept. The rest follows naturally.

But to be fair, it's never that simple and you always end up with some part of a system that's less "well-designed". In that case, figuring out who talks to who can quickly become a nightmare.

Actors are great on paper, but to benefit from them, you need a great understanding of your domain. I tend to use them later in the development process, on specific parts where the domain is rich and understood.

jghn · on Dec 2, 2024

> It is just hard to comprehend all the communication pathways and what happens in the system.

Having worked on large scale actor-based systems before, I'll attest this is quite true. However, what often gets lost in these conversations is that this is also true of large scale OOP based systems as well.

If one takes a few steps back and squints, there's really not much difference between Objects and Actors: in both cases you have a namespaced entity (object, actor) that receives signals via some defined mechanism (methods, messages) which lead it to perform some action.
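
To make the comparison concrete, here is a minimal sketch in Python (standard library only). `Counter`, `CounterActor`, and `ask` are illustrative names, and the actor is a bare thread plus queue, not any particular actor framework.

```python
import queue
import threading


class Counter:
    """Object: the signal is a method call, handled on the caller's thread."""

    def __init__(self):
        self.value = 0

    def increment(self, n):
        self.value += n
        return self.value


class CounterActor:
    """Actor: the signal is a message, handled one at a time on the actor's own thread."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._value = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message, reply_to=None):
        self._mailbox.put((message, reply_to))

    def _run(self):
        while True:
            (kind, n), reply_to = self._mailbox.get()
            if kind == "stop":
                break
            if kind == "increment":
                self._value += n
                if reply_to is not None:
                    reply_to.put(self._value)


def ask(actor, message):
    """Send a message and block until the actor replies."""
    reply = queue.Queue(maxsize=1)
    actor.send(message, reply)
    return reply.get(timeout=1)
```

`Counter().increment(2)` and `ask(CounterActor(), ("increment", 2))` do the same job; the difference is where the work runs and how the request gets there.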

rdtsc · on Dec 2, 2024

> But I also saw how hard it is to understand a large system that is built using actors.

Indeed, it can be just as much of a spaghetti mess as any other code, but it becomes easier if actors are the preferred abstraction for a platform already, for instance as it is for Erlang/Elixir on the BEAM VM.

The platform comes with a few benefits such as:

  1) Immutable data: inside each actor the state is explicitly evolved from one message to the next. It's passed as an explicit argument to functions. Erlang is even better as the variable binding itself is immutable.

  2) Isolated heaps: actors all have isolated heaps. You can have millions of them per OS process and they can't reach in and modify each other's memory. They have to send and receive a message.

  3) Supervision trees: actors that work together can be grouped into a tree hierarchy so that if one starts, it starts the others and they have "links" between them. If some crash, others crash with them. After the crash they can be restarted safely. It can be done safely because they have isolated heaps. Restarting a bunch of OS threads in a regular C/Java/etc program cannot be done safely, usually. These supervision hierarchies are how the system can be organized. A top level actor might serve as the API endpoint for its children so messages go through it.

  4) Tracing/live debugging: every message that is sent or function call can be traced dynamically by connecting to a live system. That can be helpful for making sense of the mess when debugging.

There are many "actor" systems out there. It's not a big deal to write a function to send a message to a lockless "mailbox" to be received by a thread in pretty much any modern language/platform. Doing that seems like it gets you 90% of the way to "actors", but without those 4 points above it only gets you 10% of the way. You can build a quick demo, but it would become a nightmare in a production system.
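
A toy sketch of points 1, 3 and 4 in Python, to show the shape of the idea rather than what the BEAM does. State is an immutable tuple passed explicitly to the handler, a crash resets the actor to its initial state, and every message is recorded in a trace. The names `handle` and `supervise` are invented for this example. Point 2 is exactly what this cannot show: Python threads share one heap, so nothing here is isolated the way Erlang processes are.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

State = Tuple[int, ...]  # immutable: each message produces a new tuple


def handle(state: State, message: int) -> State:
    """Evolve state explicitly from one message to the next."""
    if message < 0:
        raise ValueError("negative input")
    return state + (message,)


def supervise(
    handler: Callable[[State, int], State],
    initial_state: State,
    mailbox: Deque[int],
    max_restarts: int = 3,
):
    """Drain the mailbox; on a crash, restart the handler from its initial state."""
    state = initial_state
    restarts = 0
    trace: List[tuple] = []
    while mailbox:
        message = mailbox.popleft()
        try:
            state = handler(state, message)
            trace.append(("ok", message, state))
        except Exception as exc:
            restarts += 1
            trace.append(("crash", message, repr(exc)))
            if restarts > max_restarts:
                raise  # give up, like a supervisor exceeding its restart limit
            state = initial_state  # safe only because the old state is simply discarded
    return state, restarts, trace
```

Feeding it `deque([1, 2, -1, 3])` processes 1 and 2, crashes on -1, restarts from the empty state, and then processes 3.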

LaserToy · on Nov 7, 2024

Amazon

LaserToy · on Nov 4, 2024

Can anyone talk about a better model?

Some things to consider: 1) The company needs a way to weed out folks who are net negative. In general, if someone is not playing their part, there should be a mechanism to evict if up-leveling fails. 2) The company needs a way to distribute incentives (bonuses) as fairly as possible.

gtramont · on Nov 5, 2024

The major problem I see with these "performance reviews" is putting everything into the same bucket: feedback, compensation, career ladder progression—which I could go on about, but this isn't the point I'm trying to make.

At the end of the day, what _really_ matters is whether people are getting compensated fairly compared to their peers. Sure, some people like to play the power game and get excited about becoming a "newly-made-up-title-that-sounds-important-but-I-dont-get-paid-more". But these people are only playing a game, very likely unknowingly, that was already set up by the company.

It all comes down to how a company lays out its incentive models. And, truth be told, the vast majority of the software companies out there do a terrible job at it. The people in "charge" don't know better and end up replicating what others do: a Taylorist approach (https://en.wikipedia.org/wiki/Scientific_management). It goes without saying that, for a company that requires knowledge work, this isn't the best approach. A lot of perverse incentives crop up (https://en.wikipedia.org/wiki/Perverse_incentive).

A better model, from my perspective, is one that dissociates feedback from compensation. This usually goes hand-in-hand with a more transparent culture; with self-managing and self-organizing companies:

- https://en.wikipedia.org/wiki/Maverick_(book)
- https://www.reinventingorganizations.com/
- https://mooseheadsonthetable.com/
- https://www.humanocracy.com/

Team-set salaries is a model I like a lot. Unfortunately, it isn't widespread. Here are a few more resources on it:

- https://www.percival.live/post/team-set-salaries-tss
- https://www.infoq.com/news/2022/03/tss-company-wide-compensa...
- https://www.youtube.com/watch?v=M1rMMmO_iO0

Hopefully I planted a few seeds.

LaserToy · on Oct 12, 2024

How much are you paid? Total comp?

benterix · on Oct 12, 2024

I'm reluctant to provide details, as these days you can be doxxed just by your writing style, but I manage to save 85% of my salary and rent a private 50sqm office with all amenities just for myself.

LaserToy · on Oct 12, 2024

Was trying to gauge whether you are underpaid because you only accept remote work.

The economy will determine the ultimate outcome, especially for tech companies, as they compete globally.