by gpderetta 1867 days ago
Yes, but then in many cases the caller, the callee, or both might need to make a copy, defeating the point of the optimization or even making things worse than the original.

In many cases the callee would have to make a copy, yes. However:

1. In many cases, no copy would have to be made. There are lots of small non-complex functions out there where the compiler can prove that it's safe to not make a copy.

2. In many other cases, a copy has to be made. But the copy is made by the callee, not by the caller. That means that all the instructions necessary to copy the argument end up in the binary once, in the callee, rather than once for every function call, leading to less code bloat (which has its own performance advantages).

In fact, a stupid compiler could just always make a copy without analyzing the function body. This would result in a compiler which generates code that's about as fast as it would be with current ABIs, but with a smaller size.
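To illustrate the two cases, here is a minimal hand-lowered sketch in C of what such a calling convention might produce. The struct `Big`, its size, and the `_lowered` function names are assumptions made up for this example; a real compiler would do this below the source level.

```c
#include <stddef.h>

/* Hypothetical large struct; name and size are assumptions for illustration. */
struct Big { int data[64]; };

/* Source-level signature: int sum(struct Big b).
   The body only reads b, so the lowered form can use the caller's
   object directly and no copy is made anywhere. */
int sum_lowered(const struct Big *b) {
    int total = 0;
    for (size_t i = 0; i < 64; i++) {
        total += b->data[i];
    }
    return total;
}

/* Source-level signature: int bump_and_sum(struct Big b).
   The body modifies its parameter, so the callee makes the copy.
   The copy instructions exist once, here, not at every call site. */
int bump_and_sum_lowered(const struct Big *b) {
    struct Big local = *b;   /* the single callee-side copy */
    local.data[0] += 1;
    return sum_lowered(&local);
}
```

A compiler that does no analysis at all would simply emit the `struct Big local = *b;` line at the top of every such function.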

You have to make a copy on the caller or the callee if the address of the object escapes, so you might end up with two extra copies even if nothing in the program mutates the object.
I don't understand how you would end up with extra copies. My understanding is that the caller would never make a copy; it would always pass a pointer to large structs. So the absolute worst case, unless I'm missing something, is that we end up with the same number of copies as we do today (i.e. one copy per large struct passed as a parameter).

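  struct A { int m; } global = {5};
  int f(struct A a) {
   global.m = 7;  // modifies the caller's object, not the parameter
   return a.m;    // by-value semantics: must still be 5
  }
  
  int main() {
   // need to make a copy of 'global' before this call
   // otherwise f will return 7 instead of 5
   f(global);
  }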
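To make the hazard in this example concrete, here is a hand-lowered sketch of what it could look like if `f` received a pointer and made no copy of its own. The names `f_by_pointer`, `call_without_copy` and `call_with_copy` are made up for illustration and are not part of any real ABI.

```c
struct A { int m; };
struct A global = {5};

/* f lowered to take a pointer, with no callee-side copy. */
int f_by_pointer(const struct A *a) {
    global.m = 7;   /* if a points at global, this changes *a too */
    return a->m;
}

/* Caller passes the global directly: the parameter aliases global,
   so the callee observes its own write. */
int call_without_copy(void) {
    global.m = 5;
    return f_by_pointer(&global);
}

/* Caller copies first, restoring by-value semantics. */
int call_with_copy(void) {
    global.m = 5;
    struct A tmp = global;
    return f_by_pointer(&tmp);
}
```

Here `call_without_copy()` yields 7 while `call_with_copy()` yields 5, which is what the original by-value program requires.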
Hey, I've realized that there are two understandings of the proposed ABI: One in which the only promise is that the callee won't modify the object through the pointer, and one in which the callee promises to not modify the object through the pointer and the caller promises that nothing else will modify the object. Maybe you could shed some light on it since you're the author?

In the first version, the worst case situation is that only one copy is made, and it's always made by the caller. However, the caller has to make a copy if the object is referenced after any function is called, because that function might otherwise modify the parameter if a pointer to the caller's version of the object has leaked out somewhere.

In the second version, the worst case situation is that two copies are made where old ABIs would make just one copy (if the caller has to make a copy and the callee has to make a copy). However, the callee would only have to make a copy if it actually does something which might modify the object through the pointer passed as an argument, so the optimization would apply for more functions.

I think it's fairly clear from the article that your intended ABI is the first version, due to the sentence "In the event that a copy is needed, it will happen only once, in the callee, rather than needing to be repeated by every caller". But in this comment, you're implying that the caller makes a copy if it can't guarantee that nothing else has a pointer to the object?

I should have been clearer; my intention was your second interpretation. The claim that the copy happens only once is predicated on the assumption that the struct isn't aliased, which seems reasonable, since a struct is unlikely to be aliased if you're passing it around by value.

Your first interpretation is essentially what the ms/arm/riscv abis do. The reason I don't think that works as well is this:

In general, it's rare for functions to mutate their by-value parameters. We can effectively treat this as an edge case, and 'compensate' by making copies in the callee when necessary. But when does the caller need to make a copy?

Version 1: whenever the object is aliased before the call, or read from after it

Version 2: whenever the object is aliased before the call

I think using the same struct multiple times is something that happens relatively frequently, so compared with v1, v2 elides a lot of caller-side copies. In exchange, it adds a relatively small number of callee-side copies, which, despite the few pathological cases, seems likely to be overwhelmingly worth it most of the time.
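As a rough sketch of that trade-off, here is a caller that passes the same unaliased struct to a read-only callee twice, hand-lowered under each version's caller-side rule as stated above. The struct `Big`, the `caller_copies` counter and all function names are assumptions for illustration only.

```c
#include <stddef.h>

/* Hypothetical large struct; name and size are assumptions. */
struct Big { int data[64]; };

int caller_copies = 0;   /* counts caller-side copies, for illustration */

struct Big copy_of(const struct Big *s) {
    caller_copies++;
    return *s;
}

/* Lowered read-only callee: source-level int sum(struct Big b). */
int sum(const struct Big *b) {
    int total = 0;
    for (size_t i = 0; i < 64; i++) {
        total += b->data[i];
    }
    return total;
}

/* v1 rule: copy if the object is aliased before the call
   or read from after it. */
int use_twice_v1(void) {
    struct Big s = {{1, 2, 3}};
    struct Big tmp = copy_of(&s);  /* s is read again after this call */
    int a = sum(&tmp);
    int b = sum(&s);               /* last use of s: no copy needed */
    return a + b;
}

/* v2 rule: copy only if the object is aliased before the call. */
int use_twice_v2(void) {
    struct Big s = {{1, 2, 3}};
    int a = sum(&s);               /* s is never aliased: no copy */
    int b = sum(&s);
    return a + b;
}
```

Both functions compute the same result; `use_twice_v1` makes one caller-side copy and `use_twice_v2` makes none.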

If the address of the object escapes on the caller side, then the caller has to make a copy, as the object could otherwise be mutated, or sharing it could break the language's distinct-address guarantee.
I still don't understand, sorry. If the callee does something which could cause the caller's object to change, such as calling an unknown function or modifying through another pointer which might alias the parameter, the callee would just have to make a copy.

Could you provide an example of a situation where there would be more copies made using the proposed ABI than in traditional ABIs?

Sure, if calling any external function or writing through any pointer would force the callee to copy the object, then yes, you can have only the callee do the copy, but then it seems that this optimization would apply only to a very small subset of functions.
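For what it's worth, the conservative callee-side rule discussed here might look like this when lowered by hand. This is only a sketch under that reading of the rule; `g_lowered` and its source-level signature are hypothetical.

```c
struct A { int m; };

/* Source-level: int g(struct A a, struct A *out) { out->m = 7; return a.m; }
   The callee cannot prove that out does not point at the caller's
   object, so it copies the parameter before the write. */
int g_lowered(const struct A *a, struct A *out) {
    struct A local = *a;   /* callee-side copy forced by the write below */
    out->m = 7;
    return local.m;
}
```

Even when the caller passes the same object for both arguments, the callee still returns the value the parameter had at the time of the call.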