The standard would probably need to introduce destructive move semantics and trivially relocatable types (unique_ptr for example) to enable simplifying the ABI for such types. Then of course an ABI break is required to actually apply the changes.
I don't know of a good way to replace unique_ptr here.
Requiring library users to #include the concrete implementation is not good: it inflates compilation time, pollutes namespaces, and pollutes the IDE's autocompletion DB.
You can go C-style, i.e. pass a double-pointer argument to the factory or return a raw pointer. But then the user needs to remember to destroy the object.
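To make the trade-off concrete, here is a rough sketch of the three factory shapes being compared. All names (Widget, WidgetImpl, make_widget, create_widget_raw, create_widget_out, destroy_widget) are made up for illustration; in a real library only the interface and the factory declarations would live in the public header.

```cpp
#include <cassert>
#include <memory>

// Hypothetical public interface: users only ever see this abstract type.
struct Widget {
    virtual ~Widget() = default;
    virtual int value() const = 0;
};

// Hypothetical concrete implementation, normally hidden in a .cpp file.
namespace detail {
struct WidgetImpl final : Widget {
    int value() const override { return 42; }
};
}  // namespace detail

// Option A: ownership is encoded in the return type.
std::unique_ptr<Widget> make_widget() {
    return std::make_unique<detail::WidgetImpl>();
}

// Option B: C-style raw pointer; the caller must call destroy_widget().
Widget* create_widget_raw() { return new detail::WidgetImpl; }

// Option C: C-style out-parameter (double pointer) plus a status result.
bool create_widget_out(Widget** out) {
    if (out == nullptr) return false;
    *out = new detail::WidgetImpl;
    return true;
}

void destroy_widget(Widget* w) { delete w; }

// Exercises all three factories and sums the values they report.
int use_all_three() {
    int sum = 0;

    auto a = make_widget();  // destroyed automatically at end of scope
    sum += a->value();

    Widget* b = create_widget_raw();
    sum += b->value();
    destroy_widget(b);  // easy to forget

    Widget* c = nullptr;
    if (create_widget_out(&c)) {
        sum += c->value();
        destroy_widget(c);  // easy to forget
    }

    assert(sum == 3 * 42);
    return sum;
}
```

Options B and C avoid unique_ptr in the signature, but nothing stops a caller from leaking the object or destroying it twice.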
It's the overhead vs. passing a raw pointer. The Itanium C++ ABI says that std::unique_ptr has to be passed by address because its special member functions are non-trivial (the ABI can't know whether the object stores a pointer to itself).
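A small sketch of what the ABI rule is keying on. The function names (read_raw, read_owned, sum_both) are invented for the example, and the sizeof check assumes a mainstream standard library with the default deleter.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// With the default deleter, unique_ptr is the size of a raw pointer on
// mainstream implementations (assumption: libstdc++, libc++ or MSVC STL).
static_assert(sizeof(std::unique_ptr<int>) == sizeof(int*),
              "expected no size overhead with the default deleter");

// These properties are what make the type "non-trivial" for calls:
// a user-provided destructor and a user-provided move constructor.
static_assert(!std::is_trivially_destructible<std::unique_ptr<int>>::value,
              "unique_ptr has a non-trivial destructor");
static_assert(!std::is_trivially_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr has a non-trivial move constructor");

// A raw pointer is trivially copyable, so it can travel in a register.
static_assert(std::is_trivially_copyable<int*>::value,
              "raw pointers are trivially copyable");

// Takes the pointer value itself.
int read_raw(int* p) { return *p; }

// Under the Itanium C++ ABI the caller builds the parameter object in
// memory and passes its address, so reading *p costs an extra indirection.
int read_owned(std::unique_ptr<int> p) { return *p; }

// Calls both variants on the same object and adds the results.
int sum_both() {
    auto owned = std::make_unique<int>(7);
    int a = read_raw(owned.get());
    int b = read_owned(std::move(owned));  // ownership moves into the callee's parameter
    assert(a == 7 && b == 7);
    return a + b;
}
```

Both functions behave the same at the source level; the difference only shows up in the generated code for the call.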
Clang has an attribute to remove this overhead ([[clang::trivial_abi]]), but applying it to std::unique_ptr is an ABI break.
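For reference, this is roughly how Clang's [[clang::trivial_abi]] attribute looks on a hand-rolled owning pointer. OwningPtr, SKETCH_TRIVIAL_ABI, consume and demo_consume are hypothetical names; this is a sketch, not how any standard library defines unique_ptr. On compilers without the attribute the macro expands to nothing and the class is passed the usual way.

```cpp
#include <cassert>

// Use the attribute only where the compiler reports support for it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::trivial_abi)
#    define SKETCH_TRIVIAL_ABI [[clang::trivial_abi]]
#  endif
#endif
#ifndef SKETCH_TRIVIAL_ABI
#  define SKETCH_TRIVIAL_ABI
#endif

// Hypothetical minimal owning pointer, move-only like unique_ptr.
template <class T>
class SKETCH_TRIVIAL_ABI OwningPtr {
public:
    explicit OwningPtr(T* p = nullptr) noexcept : ptr_(p) {}
    OwningPtr(OwningPtr&& other) noexcept : ptr_(other.ptr_) {
        other.ptr_ = nullptr;
    }
    OwningPtr& operator=(OwningPtr&& other) noexcept {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;
        }
        return *this;
    }
    OwningPtr(const OwningPtr&) = delete;
    OwningPtr& operator=(const OwningPtr&) = delete;
    ~OwningPtr() { delete ptr_; }

    T& operator*() const { return *ptr_; }

private:
    T* ptr_;
};

// With the attribute active, the argument may be passed like a plain
// pointer, and the parameter is destroyed by the callee, not the caller.
int consume(OwningPtr<int> p) { return *p; }

// Passes a freshly allocated int by value and returns what was read.
int demo_consume() {
    int v = consume(OwningPtr<int>(new int(5)));
    assert(v == 5);
    return v;
}
```

Because the attribute changes how the type is passed and who runs the destructor, code compiled with it and code compiled without it cannot safely call each other, which is why turning it on for an existing type is an ABI break.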
Simple attempts at a fix don't really work. I'm not even sure an ABI break would be enough, but it would at least be a minimum requirement.