Latency (such as you get when communicating over the phone) makes turn-taking much more difficult. Even in person, though, the false-positive rate is still non-zero.
Exactly. When you are speaking face to face and can see the person, you get an additional visual cue from their lips and facial expression in real time, so you can tell whether they have stopped speaking or are just taking some time to gather their thoughts or find the right word (e.g. a non-native speaker).