One requires 2 people in person to engage, and the sound is usually projected in one general direction, which limits it. With the other, only 1 person has to be present and the sound may go in any direction.
Alternatively, an irl conversation has two talkers, which doubles the noise. It also has to be louder than someone on a phone because you're talking to a receiver further away, and both conversers have to be directing sound toward where people are, as opposed to a phone, which can be used while directing sound at e.g. the wall.