Street View is built from a camera mounted on a vehicle driving down the street. I see no reason the captured frames wouldn't include normal traffic flow.
Since it's stop motion, you wouldn't see the morphing transition; they would wait for it to finish and then take the picture.
Not sure if that is what they actually did, but after playing around with it a bit on my own, it looks like you could get some decently fluid-looking motion by taking a screenshot of each "frame" that Street View provides.
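For anyone who wants to try the screenshot approach, here is a rough sketch of one way to put the stills in order for stitching. It is only my guess at a workflow, not what the original creators did. The file names (`frame_2.png` and so on), the 12 fps default, and the helper names `natural_key` and `concat_list` are all made up for illustration. It builds the text of a list file for ffmpeg's concat demuxer, sorting the names so that `frame_2` comes before `frame_10`.

```python
import re


def natural_key(name):
    """Sort key that orders 'frame_2.png' before 'frame_10.png'."""
    # Splitting on digit runs alternates text and number tokens,
    # so the number tokens can be compared as integers.
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


def concat_list(frame_names, fps=12.0):
    """Return list-file text for ffmpeg's concat demuxer.

    frame_names: screenshot file names (hypothetical, one per Street View step).
    fps: assumed playback rate; each still is shown for 1/fps seconds.
    """
    duration = 1.0 / fps
    lines = []
    for name in sorted(frame_names, key=natural_key):
        lines.append(f"file '{name}'")
        lines.append(f"duration {duration:.4f}")
    return "\n".join(lines) + "\n"
```

Save the returned text as, say, `frames.txt` next to the screenshots and then run something like `ffmpeg -f concat -safe 0 -i frames.txt -pix_fmt yuv420p out.mp4`. You may need to tweak the frame rate to get the motion looking smooth.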