lock-free ring buffers still have locks for the case when they are full, so it would be interesting to see how the kernel behaves when you never read from the completion ring.
As long as command submission is only permitted (or considered valid) when the submitter ensures there is sufficient space in the completion ring for the replies, writing to the completion ring can't legitimately block, and the completion writer can assume this.
This asymmetry is because the command submitter is able to block and is in control. It even works with elastic ringbuffers that can be expanded as needed. All memory allocation can be on the command submitter side.
This trick works in other domains, not just kernels. For example request-reply ringbuffer pairs can be used to communicate with interrupt handlers, in OSes and also in bare metal embedded systems, and the interrupt handler does not need to block when writing to the reply ringbuffer. It's a nice way to queue things without locking issues on the interrupt side.
Similarly with signal handlers and attached devices (e.g. NVMe storage, GPUs, network cards), those can can always write to the reply ringbuffer too.