X86 does not require any explicit barrier if you modify through the same virtual...

chrisseaton · on Nov 30, 2021

Not sure which bit you’re saying ‘no’ to.

Most JITs do execute an icache flush, and Rosetta does catch it to invalidate their code.

For example https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x...

Otherwise, how do you think it works?

saagarjha · on Nov 30, 2021

x86 does not require an icache flush because it has a unified cache. Rosetta emulates this correctly, which means it must be able to invalidate its code without encountering such an instruction.

chrisseaton · on Nov 30, 2021

> x86 does not require an icache flush

It does if you write instructions through one address and execute them from another, which is why they use a flush.

> Rosetta emulates this correctly

Maybe you know more than I do, but my understanding is that it does not emulate it correctly if you do not flush or change permissions.

How do you think it detects a change to executable memory without a permissions change or a flush?

saagarjha · on Nov 30, 2021

Rosetta needs to support code that looks like this:

  char *buffer = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
  *buffer = 0xc3;
  ((void (*)())buffer)();
  *buffer = 0xc3;
  ((void (*)())buffer)();

The region is RWX, and code is put into it and then executed without a cache flush. This requires careful setup by the runtime, and here's how Rosetta does it, line by line:

1. buffer is created and marked as RW-, since the next thing you do with a RWX buffer is obviously going to be to write code into it.

2. buffer is written to directly, without any traps.

3. The indirect function call is compiled to go through an indirect branch trampoline. It notices that this is a call into a RWX region and creates a native JIT entry for it. buffer is marked as R-X (although it is not actually executed from, the JIT entry is.)

4. The write to buffer traps because the memory is read-only. The Rosetta exception server catches this and maps the memory back RW- and allows the write through.

5. Repeat of step 3. (Amusingly, a fresh JIT entry is allocated even though the code is the same…)

As you can see, this allows for pretty acceptable performance for most JITs that are effectively W^X even if they don't signal their intent specifically to the processor/kernel. The first write to the RWX region "signals" (heh) an intent to do further writes to it, then the indirect branch instrumentation lets the runtime know when it's time to do a translation.
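To make the five steps concrete, here is a toy model of them as a small state machine. This is only a sketch of the behaviour described above, not Rosetta's code: every name in it (guest_page, guest_write, guest_call, replay_snippet) is made up for illustration, and it only tracks what the host protection would be rather than touching real page tables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one guest RWX page as the runtime sees it. */
enum host_prot { HOST_RW, HOST_RX };

struct guest_page {
    uint8_t bytes[16];      /* guest-visible contents */
    enum host_prot prot;    /* what the host page tables would really say */
    bool has_translation;   /* is there a live JIT entry for this page? */
    int traps;              /* write faults taken so far */
    int translations;       /* JIT entries created so far */
};

/* Step 1: the guest asked for RWX, the host mapping starts out RW-. */
static void page_init(struct guest_page *p) {
    *p = (struct guest_page){ .prot = HOST_RW };
}

/* Guest store. It traps only if the page is currently R-X on the host. */
static void guest_write(struct guest_page *p, size_t off, uint8_t value) {
    if (p->prot == HOST_RX) {        /* step 4: write fault */
        p->traps++;
        p->has_translation = false;  /* the stale JIT entry is dropped */
        p->prot = HOST_RW;
    }
    p->bytes[off] = value;           /* step 2: plain store, no trap */
}

/* Guest indirect call into the page, routed through the trampoline. */
static void guest_call(struct guest_page *p) {
    if (!p->has_translation) {       /* steps 3 and 5: translate on demand */
        p->translations++;
        p->has_translation = true;
        p->prot = HOST_RX;
    }
    /* ...the translated code would run here... */
}

/* Replays the five-line snippet above against the model. */
struct guest_page replay_snippet(void) {
    struct guest_page p;
    page_init(&p);
    guest_write(&p, 0, 0xc3);
    guest_call(&p);
    guest_write(&p, 0, 0xc3);
    guest_call(&p);
    return p;
}
```

Replaying the snippet in this model takes one write trap (the second store) and creates two translations, which matches the "fresh JIT entry" remark in step 5.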

chrisseaton · on Nov 30, 2021

That’s a more limited case than what we’re talking about in this thread.

Think about code that is modified without jumping into it, such as stubs that are modified or certain kinds of yield points.

saagarjha · on Nov 30, 2021

Writing to an address would invalidate all JIT code associated with it, not just code that starts at that address. Lookup is done on the indirect branch, not on write, so a new entry would be generated once execution runs through it.
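A rough sketch of what "invalidate everything associated with the address" could look like, assuming each JIT entry remembers the range of guest bytes it was translated from. The table layout and the names jit_add, jit_invalidate and jit_lookup are hypothetical, and a real runtime might well track this per page instead of per entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 8

/* Hypothetical JIT entry covering guest bytes [start, start + len). */
struct jit_entry {
    uintptr_t start;
    size_t len;
    bool valid;
};

struct jit_cache {
    struct jit_entry entries[MAX_ENTRIES];
    size_t count;
};

/* Record a translation; returns false if the table is full. */
bool jit_add(struct jit_cache *c, uintptr_t start, size_t len) {
    if (c->count == MAX_ENTRIES) {
        return false;
    }
    c->entries[c->count++] = (struct jit_entry){ start, len, true };
    return true;
}

/* On a write: drop every entry whose source range contains addr,
 * not only entries that begin at addr. Returns how many were dropped. */
size_t jit_invalidate(struct jit_cache *c, uintptr_t addr) {
    size_t dropped = 0;
    for (size_t i = 0; i < c->count; i++) {
        struct jit_entry *e = &c->entries[i];
        if (e->valid && addr >= e->start && addr - e->start < e->len) {
            e->valid = false;
            dropped++;
        }
    }
    return dropped;
}

/* On an indirect branch: look for a live entry with this entry point.
 * A miss is what would trigger a fresh translation. */
bool jit_lookup(const struct jit_cache *c, uintptr_t target) {
    for (size_t i = 0; i < c->count; i++) {
        const struct jit_entry *e = &c->entries[i];
        if (e->valid && e->start == target) {
            return true;
        }
    }
    return false;
}
```

With entries at 0x1000 (0x40 bytes) and 0x1020 (0x10 bytes), a write to 0x1024 drops both, so the next branch to either entry point misses and gets retranslated.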

anyfoo · on Nov 30, 2021

> How do you think it detects a change to executable memory without a permissions change or a flush?

One way this could be implemented is the way mentioned above: by making sure all x86-executable pages are marked read-only (in the real page tables, not from "the x86 API"). Whenever any code writes into one, the resulting page fault can flush out the existing translation and transparently return to the x86 program, which can proceed to write into the region without taking another write fault (the kernel will actually mark the pages as writable in the page tables now).

When the x86 program then jumps into the modified code, no translation exists anymore, and the resulting page fault from trying to execute can trigger the translation of the newly modified pages. The (real, not-pretend) writable bit is removed from the x86 code pages again.

To the x86 code, the pages still look like they are writable, but in the actual page tables they are not. So the x86 code does not (need to) change the permission of the pages.

I don't know if that's exactly how it is implemented, but it is a way.
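For what it's worth, the write-fault half of this can be tried in user space with mprotect and a signal handler. The sketch below is only an illustration of the general trick, not how Rosetta is built: the function names are invented, "translation" is reduced to a flag, the execute-side fault is replaced by an explicit call to translate_guest_page, and calling mprotect from a signal handler works on Linux and macOS in practice but is not on the POSIX async-signal-safe list.

```c
#define _DEFAULT_SOURCE /* glibc: expose MAP_ANON under strict -std=c11 */
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *g_page;                     /* stand-in for a guest code page */
static size_t g_page_size;
static volatile sig_atomic_t g_write_faults;      /* write faults seen so far */
static volatile sig_atomic_t g_translation_valid; /* 1 while a "translation" exists */

/* Fault handler: a write hit the read-only page. Flush the translation,
 * make the page writable, and return so the faulting store is retried. */
static void on_fault(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    unsigned char *addr = (unsigned char *)info->si_addr;
    if (addr < g_page || addr >= g_page + g_page_size) {
        _exit(1); /* a genuine crash somewhere else */
    }
    g_write_faults++;
    g_translation_valid = 0;
    if (mprotect(g_page, g_page_size, PROT_READ | PROT_WRITE) != 0) {
        _exit(1);
    }
}

/* Install the handler and map one page that starts out writable. */
int setup_guest_page(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Linux reports the fault as SIGSEGV, macOS as SIGBUS. */
    if (sigaction(SIGSEGV, &sa, NULL) != 0 || sigaction(SIGBUS, &sa, NULL) != 0) {
        return -1;
    }
    g_page_size = (size_t)sysconf(_SC_PAGESIZE);
    g_page = mmap(NULL, g_page_size, PROT_READ | PROT_WRITE,
                  MAP_ANON | MAP_PRIVATE, -1, 0);
    return g_page == MAP_FAILED ? -1 : 0;
}

/* Stand-in for "translate the page": mark it valid and write-protect it. */
int translate_guest_page(void) {
    g_translation_valid = 1;
    return mprotect(g_page, g_page_size, PROT_READ);
}

/* A guest store; volatile so the compiler keeps it in program order. */
void guest_store(size_t off, unsigned char value) {
    ((volatile unsigned char *)g_page)[off] = value;
}

int write_fault_count(void) { return (int)g_write_faults; }
int translation_is_valid(void) { return (int)g_translation_valid; }
```

From the caller's point of view guest_store never fails, which is the point: the "guest" keeps believing the page is writable while the real protection flips underneath it. Only the first store after a translation pays for a fault; later stores go straight through until the page is translated again.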