You can test this. Get a bunch of audiophiles in a room, play various recordings from various media (24kHz WAV, MP3 at good bitrates and encoding settings, bad MP3, Ogg, CD, vinyl, etc.) without showing them which is which, ask which one they like the most, and see if the results correlate with the supposed quality of the source.
I don't have a source handy, but this has been a hot topic in audiophile land for decades, and the tl;dr is that they'll pick out the really bad sources (e.g. <128 kbps MP3) but not the rest. Basically the results look like those from a blind beer tasting: no correlation between the winner and the supposed quality, except when the quality is especially bad.
I'm no scientist, but to me "audiophiles who really care about this stuff can't pick out the good MP3 from the uncompressed original" is sufficient proof that MP3 is, actually, based on a sufficiently well-understood model of human hearing.
https://en.wikipedia.org/wiki/ABX_test
A properly conducted ABX test is the most favorable condition possible for detecting a difference. If you can't ABX it, you can't hear it.
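For what it's worth, "can't ABX it" is usually judged statistically: you count how many trials you got right and ask how likely that score would be from pure guessing. Here's a rough sketch of that calculation, assuming each trial is an independent 50/50 guess. The function name abx_p_value and the usual 0.05 cutoff are my own choices for illustration, not something any particular ABX tool prescribes.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by guessing.

    Assumes every trial is an independent coin flip (p = 0.5), which is
    the null hypothesis "the listener cannot hear a difference".
    """
    if trials < 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # One-sided binomial tail: sum of C(trials, k) for k >= correct,
    # divided by the 2**trials equally likely guess sequences.
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical session: 12 correct answers out of 16 trials.
    p = abx_p_value(12, 16)
    print(f"p = {p:.4f}")  # p = 0.0384
```

A small p-value means the score is hard to explain by guessing. A large one only means this session didn't show a difference, so more trials help before concluding anything.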
There's an ABX testing website with various lossy formats you could try:
https://abx.digitalfeed.net/list.html
However, failing to ABX those specific samples does not guarantee you are unable to tell the difference in all circumstances. There are some sounds that are unusually difficult to encode ("killer samples"). This is an especially big problem for MP3. The LAME project has a collection of killer samples for MP3:
https://lame.sourceforge.io/quality.php
More modern lossy formats are less susceptible to killer samples, but theoretically there could still be problematic cases.