OpenStreetMap has the building outlines (sometimes even individual parts of buildings) as well as tagging that specifies:
• wall and roof color
• wall and roof material
• roof shape
• height (or levels, which are then typically multiplied by 3 m to get a rough height)
• starting height (or level), for building parts above ground, e.g. bridges between buildings.
This can then be rendered as polygons extruded upwards, with a cap in the shape of the roof. There are a few more complicated ways of specifying complex roofs with ridge lines, etc., but few renderers support them and usage is fairly limited.
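To make the extrusion idea concrete, here is a rough Python sketch, not how any particular renderer does it. It assumes the tag keys from the OSM "Simple 3D Buildings" scheme (`height`, `building:levels`, `min_height`, `building:min_level`), which I didn't name above, and it assumes the footprint is already projected to local metres. The function names are made up for illustration, and it only produces a flat cap, ignoring roof shape entirely.

```python
LEVEL_HEIGHT_M = 3.0  # rough per-level height, the "3 m" rule of thumb


def parse_number(value):
    """Parse values like '12', '12.5' or '12.5 m'; return None if unparseable."""
    if value is None:
        return None
    text = value.strip().lower()
    if text.endswith("m"):
        text = text[:-1].strip()
    try:
        return float(text)
    except ValueError:
        return None


def vertical_extent(tags):
    """Return (base, top) in metres for a building or building part."""
    top = parse_number(tags.get("height"))
    if top is None:
        levels = parse_number(tags.get("building:levels"))
        # Fall back to a single level if nothing is tagged (an assumption).
        top = (levels if levels is not None else 1.0) * LEVEL_HEIGHT_M

    base = parse_number(tags.get("min_height"))
    if base is None:
        min_level = parse_number(tags.get("building:min_level"))
        base = (min_level if min_level is not None else 0.0) * LEVEL_HEIGHT_M
    return base, top


def extrude(footprint, base, top):
    """Extrude a 2D outline (list of (x, y), not closed) into a flat-capped prism.

    Returns (vertices, faces); each face is a tuple of vertex indices.
    """
    n = len(footprint)
    vertices = [(x, y, base) for x, y in footprint]
    vertices += [(x, y, top) for x, y in footprint]

    faces = []
    for i in range(n):
        j = (i + 1) % n
        faces.append((i, j, n + j, n + i))        # one wall quad per edge
    faces.append(tuple(range(n, 2 * n)))          # flat roof cap
    faces.append(tuple(reversed(range(n))))       # floor, wound the other way
    return vertices, faces


# Example: a 10 m square building part spanning levels 1 to 4,
# e.g. a bridge between two buildings.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
base, top = vertical_extent({"building:levels": "4", "building:min_level": "1"})
vertices, faces = extrude(square, base, top)
```

A real renderer would also have to triangulate the caps, handle outlines with holes (multipolygons), and swap the flat cap for whatever roof shape is tagged.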
I think it must be using the building outline data (and metadata like the heights) that people have entered, because in my city the inner city has very complete data, but a bit further out, around where I live, there aren't any building outlines on OSM and there's nothing in this 3D map apart from the streets.
The result is very impressive given that seems to be the case!