No, it's an inherent problem whenever a packet-switched, multi-user, shared channel built around queued transmission tries to emulate a real-time protocol, the earliest of which is "a directly connected wire."
This sort of substitution can work, but it requires very stable, low-noise links (that is, with little to no burst noise) and the cooperation of devices and network equipment.
...Okay, how would you suggest a better protocol avoid that problem? For voice/video calls, you need a fairly small buffer size so that latency isn’t distracting. But if you have a (say) 250ms buffer, and the wireless connection drops out for 500ms, then there’s no way to avoid losing some packets.
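To put rough numbers on that, here is a minimal sketch of a playout buffer riding out a dropout. Everything in it is an assumption for illustration: the `late_packets` helper, the 20ms packet interval, and the best-case premise that packets sent during the outage are queued and all arrive the instant the link comes back (in practice many would simply be dropped).

```python
def late_packets(buffer_ms, outage_start_ms, outage_ms,
                 interval_ms=20, total_ms=3000):
    """Return send times (ms) of packets that miss their playout deadline.

    Toy model: zero network latency outside the outage, and packets sent
    during the outage are delivered the moment the link returns.
    """
    outage_end = outage_start_ms + outage_ms
    late = []
    for sent in range(0, total_ms, interval_ms):
        in_outage = outage_start_ms <= sent < outage_end
        arrival = outage_end if in_outage else sent
        deadline = sent + buffer_ms  # when the receiver must play this packet
        if arrival > deadline:
            late.append(sent)
    return late


missed = late_packets(buffer_ms=250, outage_start_ms=1000, outage_ms=500)
print(len(missed), "packets arrive too late to be played")
```

Even in this generous model, any audio older than the buffer depth by the time the link returns is useless. Only an outage shorter than the buffer, or a buffer longer than the outage, gets through without a gap.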
TCP’s reliability comes at the cost of increased, inconsistent, and unpredictable latency, which is fine for downloading a file, browsing the web, or streaming video with a 30-second buffer, but unacceptable for a real-time call.
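A toy illustration of where that unpredictability comes from: with in-order delivery, one retransmitted segment holds back everything sent after it. This is not real TCP. The `in_order_delivery` helper and the arrival times are made up for the example, with the third segment assumed to be lost and retransmitted so that it arrives at 390ms.

```python
from itertools import accumulate


def in_order_delivery(arrivals_ms):
    """Time each segment reaches the application under in-order delivery.

    A segment cannot be handed over before every earlier segment has
    arrived, so delivery time is the running maximum of arrival times.
    """
    return list(accumulate(arrivals_ms, max))


# Hypothetical arrival times; the third segment was retransmitted.
arrivals = [50, 70, 390, 110, 130]
print(in_order_delivery(arrivals))
```

The segments that arrived on time at 110ms and 130ms still wait for the retransmission. A file download never notices this, but a call with a small playout buffer does.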