
> Well no, you still want to minimize garbage production (and related GC overhead). Wear leveling doesn’t mean produce more garbage.

Surely having some % of garbage helps, since you're garbage collecting and shuffling data around anyway to wear-level more evenly?

Let's say you have 99% used data and 1% garbage. You have very little room for (static) wear leveling. For example: if you're writing 10MB of new data and your drive is 99% full of allocated data... you'd have to move 1000MB of data around to "find" the 10MB of garbage, on average.

In the other extreme case, 0% data and 100% garbage (say a TRIM operation just completed), you simply write the 10MB without having to move any data around.

The 50% data + 50% garbage case scales accordingly: writing 10MB of new data requires finding and coalescing 10MB of garbage, which means moving roughly 20MB of data overall.
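A back-of-the-envelope sketch of that relationship (hypothetical Python, assuming garbage is spread uniformly across erase blocks, not modeling any real FTL):

    # Data moved as a function of garbage fraction, assuming garbage is
    # spread uniformly across erase blocks. Illustrative only.
    def data_moved_mb(new_write_mb, garbage_fraction):
        if garbage_fraction <= 0:
            return float("inf")   # no garbage at all: nothing can be reclaimed
        # To free new_write_mb of space, roughly new_write_mb / garbage_fraction
        # worth of flash has to be scanned, relocating the still-valid portion.
        return new_write_mb / garbage_fraction

    for pct in (1, 50, 100):
        print(f"{pct}% garbage -> move ~{data_moved_mb(10, pct / 100):.0f} MB to place 10 MB")
    # 1% garbage -> move ~1000 MB to place 10 MB
    # 50% garbage -> move ~20 MB to place 10 MB
    # 100% garbage -> move ~10 MB to place 10 MB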

----------

I'm oversimplifying, of course. But even in a real-life system, I'd expect that the more "garbage" you have sitting around, the better the FTL / SSD's (static or dynamic) wear-leveling algorithms work.



I can’t state it any more clearly: minimize the production of garbage.

“Surely some % of garbage helps as you're garbage collecting ”

Garbage that doesn’t exist doesn’t need collecting.

The flaw here is confusing free space with garbage. You shouldn’t have written in the first place if you could have avoided it.

Every environmentalist knows this: RRR, and the first R is reduce, not recycle.


Any append-only data structure will have data that was true when it was written but has become false/obsolete/garbage at a later time. This is unavoidable.

I'm not saying that we write needless garbage to the logs or filesystem or whatever. I'm saying that the garbage you leave behind in your stream aids the later steps of (static) wear-leveling, so it's not a big deal. You're going to produce plenty of this (data that was true initially, but becomes garbage later) as files get updated, moved around filesystems, or whatnot.

"Garbage" in this sense is perhaps the wrong word. Its more like "Obsolete data", or "outdated data".


If you read the article, though, many of the updated nodes (which are now garbage) don't see any updates to their “data”, only to their internal tree pointers.

So lots of “data” is being copied, and garbage is being generated, only for the benefit of tidying up tree structure, not because the actual “data” in those pages changed.

Not generating such garbage in the first place is an obvious benefit.
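A minimal sketch of why that happens (hypothetical Python, not the article's code): in a copy-on-write tree, changing one leaf forces a fresh copy of every ancestor up to the root, so the old internal nodes become garbage even though only a child pointer in them changed.

    # Minimal copy-on-write binary tree (illustrative only).
    class Node:
        def __init__(self, value, left=None, right=None):
            self.value, self.left, self.right = value, left, right

    def update_leftmost(node, new_value):
        # Returns a new root; every node on the root-to-leaf path is a fresh copy.
        if node.left is None:
            return Node(new_value, None, node.right)
        return Node(node.value, update_leftmost(node.left, new_value), node.right)

    old_root = Node("root", Node("a", Node("leaf")), Node("b"))
    new_root = update_leftmost(old_root, "leaf2")
    # The old root, the old "a", and the old leaf are now unreachable garbage,
    # even though the root and "a" only had a child pointer change.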


I think I see what you mean now. Thanks.



