Indeed, and with the work done by Guix and the Reproducible Builds project we do have a real-world example of diverse double compilation which is not just a toy example utilizing the GNU Mes C compiler.
Projects like GNU Mes are part of the Bootstrappable Builds effort[0]. Another great achievement in that area is the live-bootstrap project, which has automated a build pipeline that goes from a minimal binary seed up to tinycc then gcc 4 and beyond.[1]
I feel the need to point out that the "Bootstrappable Builds" project is a working group from the Reproducible Builds project, made up of people who were interested in the next step beyond reproducing binaries. Obviously this project has seen most effort from Guix :)
The GNU Mes C experiment mentioned above was also conducted during the 2019 Reproducible Builds summit in Marrakesh.
In principle, diverse double-compiling merely increases the number of compilers the adversary needs to subvert. There are obvious practical concerns, of course, but frankly this raises the bar less than maintaining the backdoor across future versions of the same compiler did in the first place, since at least backdooring multiple contemporary compilers doesn't rely on guessing, well ahead of time, what changes future developers are going to make.
Critically, it shouldn't be taken as a demonstration that the toolchain is trustworthy unless you trust whoever's picking the compilers! This kind of ruins approaches based on having any particular outside organization certify certain compilers as "trusted".
Actually pulling this off is an uphill effort. While theoretically a very informed adversary might get it right the first time, human adversaries are unlikely to, and their resources are large but far from infinite.
Your entire effort can be brought down by someone making a change you didn't expect, noticing something odd, and going "huh, that's funny..."
Quite frankly, I'm surprised that it hasn't come up multiple times in the course of getting to NixOS, etc. The attacks are easy to hide and hard to attribute.
Programs built by different compilers aren't generally binary comparable, e.g. after `gcc -o prog-gcc run-of-the-mill.c` and `clang -o prog-clang run-of-the-mill.c` we shouldn't expect `cmp prog-gcc prog-clang` to succeed.
However, the behaviour of programs built by different compilers should be the same. Run-of-the-mill programs could use this as part of a test suite, for example; but diverse double compilation goes a step further:
We build compiler A using several different compilers X, Y, Z; then use those binaries A-built-with-X, A-built-with-Y, A-built-with-Z to compile A again. The resulting binaries A-built-with-(A-built-with-X), A-built-with-(A-built-with-Y), A-built-with-(A-built-with-Z) should all be identical, assuming A's build is deterministic, because each was produced by a binary that implements the same source. Hence for 'fully countering trusting trust through diverse double-compiling', we must compile compilers: https://dwheeler.com/trusting-trust/
https://dwheeler.com/trusting-trust/#real-world
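To make the two-stage scheme above concrete, here is a toy model in Python. It doesn't invoke any real compiler: `Binary`, `compile_with`, `ddc`, `all_identical` and the `"+backdoor"` marker are all made-up names for illustration, and the "binaries" are just records of who generated them and what they do. Treat it as a sketch of the idea under those assumptions, not as how Guix or Wheeler's scripts actually do it.

```python
from __future__ import annotations
from dataclasses import dataclass

BACKDOOR = "+backdoor"     # hypothetical marker for subverted behaviour
A_SOURCE = "compiler-A"    # stand-in for the (audited) source of compiler A

@dataclass(frozen=True)
class Binary:
    codegen: str    # which compiler behaviour laid out these "bytes"
    behaviour: str  # what the binary actually does when run

def compile_with(compiler: Binary, source: str) -> Binary:
    """Run a toy compiler binary on some source."""
    behaviour = source
    # Trusting-trust attack: a subverted compiler re-inserts the
    # backdoor whenever it recognises that it is compiling compiler A.
    if BACKDOOR in compiler.behaviour and source == A_SOURCE:
        behaviour += BACKDOOR
    # The output's layout depends on how the compiler really behaves.
    return Binary(codegen=compiler.behaviour, behaviour=behaviour)

def ddc(bootstrap: dict[str, Binary], source: str) -> dict[str, Binary]:
    """Build A with each bootstrap compiler, then rebuild A with the result."""
    stage2 = {}
    for name, compiler in bootstrap.items():
        stage1 = compile_with(compiler, source)      # A-built-with-X
        stage2[name] = compile_with(stage1, source)  # A-built-with-(A-built-with-X)
    return stage2

def all_identical(binaries: dict[str, Binary]) -> bool:
    return len(set(binaries.values())) == 1

honest = {name: Binary("vendor", name) for name in ("X", "Y", "Z")}
print(all_identical(ddc(honest, A_SOURCE)))   # True

# Same setup, but X has been subverted.
mixed = dict(honest, X=Binary("vendor", "X" + BACKDOOR))
print(all_identical(ddc(mixed, A_SOURCE)))    # False
```

Note that the stage-1 binaries differ in all cases (different `codegen`), which matches the point about gcc and clang output not being comparable; only the stage-2 binaries are expected to match. In this toy model, if X, Y and Z all carry the same backdoor the check passes again, which is the caveat raised earlier in the thread about who picks the compilers.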