by substack 1735 days ago
The website hasn't been updated in a while and the stack has diverged somewhat from what is written there, but we're hard at work making all the pieces fit together. The p2p database (eyros) works pretty well, with only some transfer size improvements left. The database is fully symmetric and runs entirely in the browser with a ~400kb wasm build (I will work on getting this down later). The rendering stack also works pretty well. The main hurdle at the moment, which I have been working on, is the ingest phase, which consumes planet-osm.pbf and writes into the spatial database (eyros). I can process all the nodes in 1 hour and all the ways in 35 hours on a not very expensive VPS, but I still run out of memory when processing the relations. I can probably get this ingest phase working in the next few weeks, and then we will have an initial data release. There is still some rendering work left on polish and labels, but it basically works.

The main initial benefit of the project is completely free embeddable web maps hosted p2p (using ipfs, hyperdrive, webtorrent) where you can entirely customize the rendering. Later, the benefits of the fully symmetric nature of the database will start to make more sense, and the project has the potential to become less centralized on OpenStreetMap servers and data. For some prior work that I and other people on peermaps have done, check out https://mapeo.world/ and this very old writeup I made about an early version of the osm-p2p database. The experience of working on that project and prior versions heavily informs how the current peermaps stack works.

You can check out a more up-to-date view of the progress in this talk we gave for speakeasyjs recently: https://www.youtube.com/watch?v=P7X7C-door4

Or here is a slightly old, slightly broken version of panning across a processed version of Switzerland using the end-to-end stack on my laptop (it looks better now): https://www.youtube.com/watch?v=gHEmmQ6GnDI

1 comments

This is incredible. How big of a VPS would you need to preprocess the whole dataset now?

If I use peermaps and zoom to a particular city, how does it find peers that have that particular part of the db?

It's similar to how, with a torrent, you can seek to a particular spot in a file and start playing by requesting the chunks at that spot. Some clients like webtorrent support this behavior, but it changes the dynamics of the network somewhat if many clients do it. You can build some supplementary peer info to help the process along for different p2p networks, depending on whether they let you create side-channels or make more explicit connections to peers. For peermaps, the database is file and directory based, so most of that peer tree traversal should already be handled by the network. And there are more ways to optimize the connections with additional tricks once you get the basics working with a somewhat slower and less sophisticated transfer method.
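
To make the torrent analogy a bit more concrete, here is a minimal sketch of how a byte range in a file maps onto the pieces a client would need to request. It assumes fixed-size pieces, and `pieces_for_range` is a hypothetical helper name, not part of webtorrent or peermaps.

```python
def pieces_for_range(offset: int, length: int, piece_size: int) -> range:
    """Return the piece indexes covering bytes [offset, offset + length).

    Assumption: the file is split into fixed-size pieces, as in a torrent.
    """
    if offset < 0 or piece_size <= 0:
        raise ValueError("offset must be >= 0 and piece_size must be > 0")
    if length <= 0:
        return range(0)  # nothing requested, so no pieces are needed
    first = offset // piece_size
    last = (offset + length - 1) // piece_size  # index of the last byte's piece
    return range(first, last + 1)


# Seeking 1 MiB into a file with 256 KiB pieces and reading 300 KiB
# touches two consecutive pieces.
needed = list(pieces_for_range(1024 * 1024, 300 * 1024, 256 * 1024))
```

A client that seeks like this only asks peers for the pieces in `needed` rather than downloading the file from the start.
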
The VPS we're running on has 60GB of RAM, which should be plenty, but the ingest program needs more work to use less memory so it stops crashing when denormalizing multipolygon relations. That involves denormalizing ways, which in turn fetch nodes, all referenced by ID, and those IDs have little locality across the pbf file. If you write to temporary storage instead, it can use a lot of disk, and denormalization based on the on-disk format can get really slow. It's just all very tricky to get working well within reasonable time constraints (ideally less than a week of processing) and a reasonable memory footprint.
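
For anyone unfamiliar with what denormalizing a relation means, here is a toy sketch of the lookup chain described above: relation members point at ways by ID, and ways point at nodes by ID. It assumes everything fits in plain in-memory dictionaries, which is exactly what does not hold for the full planet file. The names (`denormalize_relation`, `ways`, `nodes`) are hypothetical and this is not the actual peermaps ingest code.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (lon, lat)


def denormalize_relation(
    member_way_ids: List[int],
    ways: Dict[int, List[int]],   # way ID -> ordered node IDs
    nodes: Dict[int, Coord],      # node ID -> coordinate
) -> List[List[Coord]]:
    """Resolve a relation's member ways into lists of coordinates."""
    rings: List[List[Coord]] = []
    for way_id in member_way_ids:
        node_ids = ways.get(way_id)
        if node_ids is None:
            continue  # member way is missing from the data, skip it
        # Each node lookup is a random access by ID; at planet scale this
        # is where the poor locality hurts.
        rings.append([nodes[n] for n in node_ids if n in nodes])
    return rings


nodes = {1: (8.0, 47.0), 2: (8.1, 47.0), 3: (8.1, 47.1)}
ways = {10: [1, 2, 3, 1]}
rings = denormalize_relation([10, 11], ways, nodes)
```

With real data the two dictionaries are the problem: every lookup is a random access into tens of gigabytes, whether that lives in RAM or on disk.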