by costco 708 days ago
Wow, this is pretty amazing... I imagine your proxy bill must be pretty huge! I always wondered how companies like Clearview scraped Instagram etc. at scale. Do you add a user to a queue, get all of that user's posts, then add everyone they follow and everyone following them to the queue, and repeat? With Twitter, I know from experience that you can predict what the next snowflake IDs will be, so in theory you could enumerate the whole site. If I recall correctly, TikTok has a similar ID scheme, but I think people weren't able to figure out what some of the last bits represented.
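
For anyone curious about the snowflake point, here is a rough sketch of why those IDs are predictable. It assumes the publicly described Twitter layout (a millisecond timestamp offset from a custom epoch in the high bits, then a 10-bit machine ID, then a 12-bit sequence counter). The function names are my own for illustration, not from any library, and none of this is claimed to apply to TikTok.

```python
from datetime import datetime, timezone

# Custom epoch used by Twitter snowflakes, in milliseconds since the Unix epoch.
TWITTER_EPOCH_MS = 1288834974657


def decode_snowflake(snowflake_id: int) -> dict:
    """Split a snowflake ID into its assumed fields (hypothetical helper)."""
    timestamp_ms = (snowflake_id >> 22) + TWITTER_EPOCH_MS
    return {
        "timestamp_ms": timestamp_ms,
        "created_at": datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc),
        "machine_id": (snowflake_id >> 12) & 0x3FF,  # 10 bits
        "sequence": snowflake_id & 0xFFF,            # 12 bits
    }


def encode_snowflake(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Build a candidate ID from the same assumed fields (hypothetical helper)."""
    return ((timestamp_ms - TWITTER_EPOCH_MS) << 22) | (machine_id << 12) | sequence
```

Since the timestamp dominates the ID and the low bits take only a small set of values, guessing IDs for a given millisecond means trying a handful of machine IDs and low sequence numbers rather than searching the whole 64-bit space.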
1 comments

Thank you. Yes, the proxy bill is quite high!

Yes, you are right: the ID enumeration method doesn't work with TikTok.