by pavehawk2007 2025 days ago
Here's the specification's recommended use of SFENCE.VMA, which is just a fancier version of what I wrote above:

Pages 66 and 67:

The following common situations typically require executing an SFENCE.VMA instruction:

1. When software recycles an ASID (i.e., reassociates it with a different page table), it should first change satp to point to the new page table using the recycled ASID, then execute SFENCE.VMA with rs1=x0 and rs2 set to the recycled ASID. Alternatively, software can execute the same SFENCE.VMA instruction while a different ASID is loaded into satp, provided the next time satp is loaded with the recycled ASID, it is simultaneously loaded with the new page table.

2. If the implementation does not provide ASIDs, or software chooses to always use ASID 0, then after every satp write, software should execute SFENCE.VMA with rs1=x0. In the common case that no global translations have been modified, rs2 should be set to a register other than x0 but which contains the value zero, so that global translations are not flushed.

3. If software modifies a non-leaf PTE, it should execute SFENCE.VMA with rs1=x0. If any PTE along the traversal path had its G bit set, rs2 must be x0; otherwise, rs2 should be set to the ASID for which the translation is being modified.

4. If software modifies a leaf PTE, it should execute SFENCE.VMA with rs1 set to a virtual address within the page. If any PTE along the traversal path had its G bit set, rs2 must be x0; otherwise, rs2 should be set to the ASID for which the translation is being modified.

5. For the special cases of increasing the permissions on a leaf PTE and changing an invalid PTE to a valid leaf, software may choose to execute the SFENCE.VMA lazily. After modifying the PTE but before executing SFENCE.VMA, either the new or old permissions will be used. In the latter case, a page fault exception might occur, at which point software should execute SFENCE.VMA in accordance with the previous bullet point.
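Cases 1 through 4 above mostly come down to picking two operands. Here is a rough Python sketch of that choice, not anything taken from the spec's text: the function names and the string labels ("vaddr", "asid", "zero_reg") are made up for illustration. The encoder at the end assumes the standard SFENCE.VMA encoding (funct7 0001001, SYSTEM opcode 1110011, rd and funct3 zero).

```python
from typing import NamedTuple


class Fence(NamedTuple):
    # "x0" means every virtual address; "vaddr" means a register holding
    # an address inside the affected page (label is illustrative).
    rs1: str
    # "x0" means every address space, global mappings included; otherwise
    # a register holding an ASID (labels are illustrative).
    rs2: str


def fence_for_recycled_asid() -> Fence:
    # Case 1: flush everything cached under the recycled ASID.
    return Fence(rs1="x0", rs2="asid")


def fence_after_satp_write_without_asids(globals_modified: bool) -> Fence:
    # Case 2: rs2 is x0 only when global translations changed; otherwise
    # a register other than x0 that contains zero, so globals survive.
    return Fence(rs1="x0", rs2="x0" if globals_modified else "zero_reg")


def fence_for_pte_change(leaf: bool, global_on_path: bool) -> Fence:
    # Cases 3 and 4: a leaf change can target one page, a non-leaf change
    # cannot; a G bit anywhere on the walk forces rs2 = x0.
    return Fence(
        rs1="vaddr" if leaf else "x0",
        rs2="x0" if global_on_path else "asid",
    )


def encode_sfence_vma(rs1: int, rs2: int) -> int:
    """Build the 32-bit instruction word from register numbers 0..31."""
    if not (0 <= rs1 < 32 and 0 <= rs2 < 32):
        raise ValueError("register numbers must be in 0..31")
    return (0b0001001 << 25) | (rs2 << 20) | (rs1 << 15) | 0b1110011


print(fence_for_pte_change(leaf=True, global_on_path=False))
print(hex(encode_sfence_vma(0, 0)))  # sfence.vma x0, x0
```

Case 5 is not modeled here; it only changes when the fence from case 4 is executed, not which operands it takes.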

1 comments

But note:

"A consequence of this specification is that an implementation may use any translation for an address that was valid at any time since the most recent SFENCE.VMA that subsumes that address. In particular, if a leaf PTE is modified but a subsuming SFENCE.VMA is not executed, either the old translation or the new translation will be used, but the choice is unpredictable. The behavior is otherwise well-defined."

This is not the "particular" case mentioned. For satp, what is "subsumed" is essentially the entire address space (you have to think of satp as simply being a non-leaf PTE).

To be fair, I think there are issues in this area that may force a double pipe flush in some implementations.

I don't understand how you get that SATP is a non-leaf PTE. SATP is a register, and a fence is for memory ordering. A register is a particular kind of memory, but in this context, I believe the authors are talking about RAM itself, not register memory.

The SFENCE.VMA instruction is used to force in-memory ordering, meaning that all prior loads and stores are completed and any cached translations are marked stale (invalid), so the MMU knows not to rely on its cache, as seen in the spec here: "The supervisor memory-management fence instruction SFENCE.VMA is used to synchronize updates to in-memory memory-management data structures with current execution." The keyword that sticks out in my mind is "in-memory". The SATP register is a register, and is not in-memory.

If a write to the SATP register requires a fence, then it should do so, much like how writing to the CR3 register on Intel/AMD x86-64 forces a flush. However, the authors of the specification specifically did not go for this behavior, in order to avoid one of the biggest problems with flushing every time: TLB thrashing. A fast context-switch rate, like Linux's 1000 Hz, would mean that a larger TLB would be no help, since a context switch, even to the kernel, would force a TLB flush. Furthermore, that would nullify the 5 cases that the specification lays out which would "typically" require an SFENCE.VMA.
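To put rough numbers on the thrashing argument, here is a toy Python model (my own sketch, not how any real core works): the TLB is an unbounded set of (ASID, page) pairs with no eviction, and `count_tlb_misses`, `schedule`, and `working_set` are made-up names.

```python
def count_tlb_misses(schedule, working_set, flush_on_switch):
    """Count misses for a run of time slices.

    schedule: ASIDs in the order they get a time slice.
    working_set: dict mapping ASID -> pages touched in every slice.
    flush_on_switch: True models "every satp write flushes the TLB".
    """
    tlb = set()  # (asid, vpn) pairs; unlimited capacity in this toy
    misses = 0
    for asid in schedule:
        if flush_on_switch:
            tlb.clear()
        for vpn in working_set[asid]:
            if (asid, vpn) not in tlb:
                misses += 1
                tlb.add((asid, vpn))
    return misses


# Two processes alternating, each touching the same four pages per slice.
schedule = [1, 2, 1, 2, 1, 2]
working_set = {1: [0, 1, 2, 3], 2: [0, 1, 2, 3]}

flushed = count_tlb_misses(schedule, working_set, flush_on_switch=True)
tagged = count_tlb_misses(schedule, working_set, flush_on_switch=False)
print(flushed, tagged)  # 24 8
```

With a flush on every switch, each slice starts cold, so the miss count grows with the number of switches. With ASID-tagged entries and no flush, each process only pays for its pages once.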

Additionally, the specification makes clear that the reason they chose this was to improve context switch performance: "We store the ASID and the page table base address in the same CSR to allow the pair to be changed atomically on a context switch. Swapping them non-atomically could pollute the old virtual address space with new translations, or vice-versa. This approach also slightly reduces the cost of a context switch." This to me means that a write to the SATP "register" itself is immediate, whereas the memory it points to (through the PPN) is not. Otherwise, this couldn't possibly be the case.
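To make the "same CSR" point concrete, here is a small Python sketch that packs and unpacks a satp value. It assumes the RV64 layout (MODE in bits 63:60, ASID in bits 59:44, PPN in bits 43:0) and MODE = 8 for Sv39; the helper names `make_satp` and `split_satp` are mine.

```python
SATP_MODE_SV39 = 8  # assumed: RV64 satp MODE encoding for Sv39

ASID_BITS = 16
PPN_BITS = 44


def make_satp(mode: int, asid: int, root_ppn: int) -> int:
    """Pack MODE, ASID and root page-table PPN into one 64-bit value.

    root_ppn is the physical address of the root page table shifted
    right by 12 (a page number, not a byte address).
    """
    if not 0 <= mode < 16:
        raise ValueError("mode must fit in 4 bits")
    if not 0 <= asid < (1 << ASID_BITS):
        raise ValueError("asid must fit in 16 bits")
    if not 0 <= root_ppn < (1 << PPN_BITS):
        raise ValueError("root_ppn must fit in 44 bits")
    return (mode << 60) | (asid << PPN_BITS) | root_ppn


def split_satp(value: int):
    """Return (mode, asid, root_ppn) from a packed satp value."""
    mode = (value >> 60) & 0xF
    asid = (value >> PPN_BITS) & ((1 << ASID_BITS) - 1)
    root_ppn = value & ((1 << PPN_BITS) - 1)
    return mode, asid, root_ppn


satp = make_satp(SATP_MODE_SV39, asid=5, root_ppn=0x80200)
print(hex(satp))  # 0x8000500000080200
```

Because all three fields land in one value, a single CSR write changes the ASID and the root table together, which is the atomicity the quoted passage is about.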

The spec goes on to state that "If the new address space’s page tables have been modified, or if an ASID is reused, it may be necessary to execute an SFENCE.VMA instruction (see Section 4.2.1) after writing satp." There is nothing I can find in the specification that states that writing to the SATP register alone necessitates an SFENCE.VMA. I can also gather from context that this is on purpose, in order to preserve the TLB across context switches, which is the only reason I can see for using ASIDs in the first place.
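The "ASID is reused" part of that quote can be shown with a toy model. This is only a sketch under my own assumptions: `ToyTLB` is a made-up class that always caches and never evicts, whereas real hardware is free to use either the old or the new translation, and global mappings are not modeled.

```python
class ToyTLB:
    """TLB keyed by (asid, vpn); it never looks at the page table on a hit."""

    def __init__(self):
        self.entries = {}  # (asid, vpn) -> ppn

    def translate(self, asid, vpn, page_table):
        key = (asid, vpn)
        if key not in self.entries:
            # Miss: walk the (toy) page table and cache the result.
            self.entries[key] = page_table[vpn]
        return self.entries[key]

    def sfence_vma_asid(self, asid):
        """Model SFENCE.VMA with rs1=x0 and rs2 holding `asid`."""
        self.entries = {
            key: ppn for key, ppn in self.entries.items() if key[0] != asid
        }


tlb = ToyTLB()
proc_a = {0x10: 0xAAA}  # page table of the process that first owned ASID 7
proc_b = {0x10: 0xBBB}  # page table of the process that inherits ASID 7

first = tlb.translate(7, 0x10, proc_a)  # A runs and fills the TLB

# ASID 7 is recycled for B and satp is rewritten, but no fence is executed.
stale = tlb.translate(7, 0x10, proc_b)  # still A's frame in this model

tlb.sfence_vma_asid(7)
fresh = tlb.translate(7, 0x10, proc_b)  # now B's frame

print(hex(first), hex(stale), hex(fresh))  # 0xaaa 0xaaa 0xbbb
```

Entries under other ASIDs are untouched by the fence in this model, which is the part that keeps the TLB warm across ordinary context switches.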

I might be wrong, and there are a ton of issues in the GitHub repository for this specification asking for clarification on a number of other things. I'm not sure we can divine the authors' intent more than we've done here.