cs_loser's comments

cs_loser · on Jan 5, 2011

Krulwich is also the clown who had his ass handed to him by none other than Neil Armstrong last month:

http://www.npr.org/blogs/krulwich/2010/12/08/131910930/neil-...

cs_loser · on Oct 28, 2010

Were I asked this, my general approach would be: (1) Given the available building blocks (disks, ram, cpus, etc) what strategies are workable? An example of a strategy would be "store it in a btree and keep all but the last two levels in ram; therefore we'll need enough ram for the top of the tree and 2x seeks for each lookup." (2) Find existing software that implements a workable strategy if at all possible, or at least the complicated components of it and code the rest yourself.

You can think of two extreme answers to this as "buy oracle" and "buy some transistors". Those are both bad answers. The first is a bad answer because, unless the candidate can answer the follow-up question "how does oracle work?" it probably just amounts to wishful thinking and they almost certainly won't be able to say when it will fail. The second is a bad answer because you can't do that in two weeks by yourself.

Take for example, Foursquare's recent outage. From what I read online, it seems like they went with "use MongoDB" as a strategy without actually understanding what Mongo was doing under the hood. Presumably they didn't figure out how it would fail (the author's follow-up question, natch) and thus they discovered that the hard way. So they chose bad answer one. On the other hand, had they decided to implement their own database, they'd know all about its characteristics, but they wouldn't have launched yet.

variety · on Oct 28, 2010

Or, they simply didn't have failover in place at all.

cs_loser · on Oct 20, 2010

Is it pretty common? As someone who does a fair number of software engineer interviews, that's a trick question. The real answer to "swap two vars with no temps" is:

1. Don't be clever in our code base. Use a temp variable. 2. There's various dumb tricks with XOR, and possibly add/subtract if overflows don't break. 3. A sequence of several instructions where each of them requires the result of the previous one may not execute particularly fast on modern processors. Instruction/cycle counts -- like 3 -- are great when there's no pipeline and no cache, but otherwise pretty much useless. 4. The things you're swapping might be local variables, and when the compiler has -O <anything> specified, local variables start getting weird, and "swap" can sometimes be done in zero instructions, namely by the compiler noting that they have now been swapped and using the other one for the rest of the basic block. (or further dominated basic blocks for that matter) 5. If the things you're swapping are in main memory, or even if it's not in L1, you're going to be incurring a cost much greater than the temporary use of a register. (and, if you don't know where they are and it might be main memory, this might dominate the average runtime)

The answer is definitely not "three xors".

mentat · on Oct 20, 2010

Embedded and firmware engineering questions expect this as an answer. Anyone following your advice will not be taken seriously. This is true regardless of whether you're correct factually. Readers of this thread deserve to know that.

leif · on Oct 21, 2010

3 is actually a quite valid point for embedded. Any swap actually will be expensive when it comes to keeping cache lines clean. The correct answer in that case is just to not swap the variables, and instead swap their uses later on:

    int x, y;
    ...
    SWAP(x, y);
    foo(x, y);

becomes

    int x, y;
    ...
    foo(y, x);

(naturally, this is why I still eagerly await the arrival of a C compiler that has macros with LISP power)

pjscott · on Oct 21, 2010

In that case, you can get the right behavior by just swapping the variables using a temporary variable. If the compiler is decent, it'll automatically swap their uses later on.

I don't know how every compiler works, but if you use Clang (or anything LLVM-based), it converts everything to Single Static Assignment (SSA) form:

    int x1 = 42, y1 = 666;
    ...
    int tmp = x1; x2 = y1; y2 = tmp;  // SWAP(x, y)
    foo(x2, y2);

In SSA form, the value of a variable does not change, so it ends up creating a bunch of "imaginary" variables to hold intermediate values. From there, it does optimizations, then figures out how best to allocate registers, and what needs to be stack-allocated.

leif · on Oct 21, 2010

Cool, thanks. I know far too little about compiler optimizations, it's always nice to hear about them.

ay · on Oct 21, 2010

It's already there:

http://chaos-pp.cvs.sourceforge.net/chaos-pp/order-pp/exampl...

But I do not envy the poor soul who would have to maintain all this cleverness.

TimMontague · on Oct 20, 2010

Most of the hacks in that guide assume that you don't care about readability or even portability to a certain extent. It certainly isn't everyday that you need to optimize your code at that level, but in some instances it could be useful (for example trying to reduce delay in a real-time program.)

IgorPartola · on Oct 21, 2010

The question I've heard is to just swap 2 variables, no restrictions.

cs_loser · on Sept 22, 2010

CPU contention is one of the least noticeable overload situations on a computer because schedulers have hidden it well since forever. Even if your video playback and your benchmark are at the same nice-value, the scheduler notices that the benchmark is using lots more CPU, so it gets effectively "niced" compared to the video playback, which then gets the CPU whenever it's runnable.

If someone swapped out your RAM for something less than the working set of the programs you have running -- you would notice. Also, if it were an I/O benchmark instead of a CPU benchmark -- you would notice. (Yes, there are I/O schedulers now, that help somewhat.) I/O and memory are not as trivial for a kernel to "make room" in by kicking out other programs.