Wow - how does it do that?! Image recognition techniques alone (as far as I know) couldn't do that, right? So how does it know how far away something is?
My guess is they estimate one or several dominant scene planes from the sparse triangulated feature points and get scale through incorporating accelerometer measurements.
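To make that guess a bit more concrete, here is a minimal sketch of the plane-estimation half only: a plain RANSAC fit of one dominant plane to a sparse 3D point cloud. This is not ARKit's actual method, which isn't public as far as I know. The function names, the inlier threshold, and the iteration count are all made up for illustration, and the points are assumed to already be in metric units (that is, scale has been recovered somehow, e.g. from the accelerometer).

```python
import random


def plane_from_points(p, q, r):
    """Plane through three 3D points as (unit_normal, d) with normal . x = d.

    Returns None when the points are (nearly) collinear.
    """
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # Cross product of the two edge vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = n[0] * p[0] + n[1] * p[1] + n[2] * p[2]
    return n, d


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Return (model, inliers) for the plane supported by the most points.

    threshold is the maximum point-to-plane distance for an inlier
    (hypothetical default, in the same units as the points).
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate sample, try again
        n, d = model
        inliers = [
            pt for pt in points
            if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] - d) < threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Feed it a list of `(x, y, z)` tuples, say triangulated feature points from a tabletop scene, and the returned normal and offset describe the dominant plane. A real system would presumably refine the fit with least squares over the inliers and repeat on the remaining points to find further planes.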
That's pretty cool. I almost wonder if someone could use this type of tech as a cheap way of mapping the inside of structures/homes. Something like an ARKit -> SketchUp service. That, or a good way to monitor job sites for correctness remotely.
It's already possible to do this in a basic sense with Canvas (which uses the Structure Sensor), and floor plans have been doable since 2014 with the RoomScan app.
What Apple and these apps don't want to say is that they absolutely can't be used for measuring correctness. The calibration and accuracy are just not that good.
Yeah, even though consumer AR (= graphics overlaid on a picture of the world) hasn't really taken off, it is being adopted by industries where people build large things like buildings, ships, etc.