Hacker News new | ask | show | jobs
by jwngr 3035 days ago
I think this is going to be extremely computationally intensive. One of the big performance wins I got when designing the search algorithm[1] was visiting as few nodes in the graph as possible (which I did via a bi-directional breadth-first search). To find the most distant node, I'd need to traverse the entire graph, which consists of almost 6 million nodes. It can be done, but it would take minutes, hours, days, ...

[1] https://github.com/jwngr/sdow/blob/a2699dc95d884ec64a4641630...

1 comments

Not too expensive, you can use parallel IDDFS to get a good approximation quickly. (Especially if you pick a good heuristic to follow links.)

Challenging part is keeping track of already visited pages to break cycles - some variant of a Bloom filter will help.