The region is RWX, and code is put into it and then executed without a cache flush. This requires careful setup by the runtime, and here's how Rosetta does it, line by line:
1. buffer is created and marked as RW-, since the next thing you do with a RWX buffer is obviously going to be to write code into it.
2. buffer is written to directly, without any traps.
3. The indirect function call is compiled to go through an indirect branch trampoline. It notices that this is a call into a RWX region and creates a native JIT entry for it. buffer is marked as R-X (although it is not actually executed from, the JIT entry is.)
4. The write to buffer traps because the memory is read-only. The Rosetta exception server catches this and maps the memory back RW- and allows the write through.
5. Repeat of step 3. (Amusingly, a fresh JIT entry is allocated even though the code is the same…)
As you can see, this allows for pretty acceptable performance for most JITs that are effectively W^X even if they don't signal their intent specifically to the processor/kernel. The first write to the RWX region "signals" (heh) an intent to do further writes to it, then the indirect branch instrumentation lets the runtime know when it's time to do a translation.
Writing to an address would invalidate all JIT code associated with it, not just code that starts at that address. Lookup is done on the indirect branch, not on write, so if a new entry would be generated once execution runs through it.
1. buffer is created and marked as RW-, since the next thing you do with a RWX buffer is obviously going to be to write code into it.
2. buffer is written to directly, without any traps.
3. The indirect function call is compiled to go through an indirect branch trampoline. It notices that this is a call into a RWX region and creates a native JIT entry for it. buffer is marked as R-X (although it is not actually executed from, the JIT entry is.)
4. The write to buffer traps because the memory is read-only. The Rosetta exception server catches this and maps the memory back RW- and allows the write through.
5. Repeat of step 3. (Amusingly, a fresh JIT entry is allocated even though the code is the same…)
As you can see, this allows for pretty acceptable performance for most JITs that are effectively W^X even if they don't signal their intent specifically to the processor/kernel. The first write to the RWX region "signals" (heh) an intent to do further writes to it, then the indirect branch instrumentation lets the runtime know when it's time to do a translation.