Then your problem isn't the medium. Your problem is a lack of trust and relying too much on facial signals that, in the larger scheme of things, are a very bad way to measure your collaborators.
Facial and body language is a HUGE part of in-person communication. For better or for worse, that is just how the vast majority of humans are wired. If you willfully ignore these signals you WILL be misunderstood and you WILL misunderstand others. I hate that things are this way because of how much effort it takes for me to decipher these cues when a neurotypical person gets them from intuition, but this absolutely does exist and isn't going away any time soon.
That's perhaps phrased a bit less charitably than necessary, but it gets at an important truth. People who rely too much[1] on these non-verbal cues are, more often than not, doing so because they're not adept verbally. It's kind of like a fortune teller, who of course does not know you or your future but can put up a pretty convincing front by observing responses to their initial probes. I see it a lot among people for whom English is not their first language, just as I see the same people make just about any excuse to get out of writing anything down permanently. Since effective remote work also has to be asynchronous work as much as possible, I'd say these people need to work on their own language skills instead of complaining about how the online experience doesn't perfectly support their coping strategies.
[1] How much is "too much"? There's plenty of room for debate, but a decade of alone-remote and a year of all-remote made it pretty clear that it's a threshold many of my colleagues at multiple companies exceed.
Last time I checked, I'm a human being who is hardwired to understand these social cues. They're essential for having a conversation in any way that isn't just exhausting for me. It's not a lack of trust. My monkey brain just struggles to parse remotely held conversations.
Yes, they're useful, but I challenge the claim that they're essential. People have been communicating effectively over both time and distance for centuries, using media where these cues are absent. They're nice to have, they can make things easier and improve comfort/trust levels, but whether you rely on them is up to you. Lots of people at all levels of language competency and introversion etc. manage to collaborate just fine, even without any form of video at all.