
by tenacious_tuna 1052 days ago

I agree with the premise that most people don't know how to identify or visibly complain about a given technical problem, and so an HN thread with N anecdotes about the problem likely corresponds to N * F real-world incidents, for some value of F > 1. But claiming it's a factor of a million without any backing evidence is absolutely an overreach.

> An open web is open for everyone/thing not just classes of beings you select. Bots and users can both be malicious and both can be positive.

This I agree with. I run an archiver ~monthly on a subset of my month's browsing history, and I'd hate if that got me blacklisted from Cloudflare-backed sites for a benign purpose. (See also the idea of remote attestation)

1 comments

computerfriend 1052 days ago

That's a pretty good idea. Do you randomly sample, or just exclude some domains? Is there some tool out there that does it for you?

tenacious_tuna 1051 days ago

Assembling the list of links to archive is a manual process--I just log them in an Obsidian notebook with a category and summary, and I later post it to my blog. (I don't really think other people care, it's more for me to be able to find past things I've found interesting.)

For the archival process I use ArchiveBox[1] running as a container on my NAS; I just grep through the note for `http|https` and feed the resulting list to the archiver. For everything not-hackernews I set the depth to 1, but for HN threads I do 2 so I grab whatever people may have linked in the comments.
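The grep-and-split step described above could look roughly like the following. This is only a sketch: the function names `extract_urls` and `group_by_depth` are made up for illustration, the URL pattern is a simple approximation (it cuts URLs at parentheses and brackets), and handing each group to the archiver with the matching depth setting is left out.

```python
import re
from urllib.parse import urlsplit

# Rough URL pattern: stops at whitespace, angle brackets, parentheses and
# square brackets so Markdown links like [text](https://...) come out clean.
URL_RE = re.compile(r"https?://[^\s<>()\[\]]+")


def extract_urls(note_text):
    """Return the unique http/https URLs in a note, in order of appearance."""
    seen = set()
    urls = []
    for match in URL_RE.finditer(note_text):
        # Drop trailing sentence punctuation that is not part of the URL.
        url = match.group(0).rstrip(".,;:")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def group_by_depth(urls):
    """Split URLs into depth 2 (HN threads) and depth 1 (everything else)."""
    groups = {1: [], 2: []}
    for url in urls:
        host = urlsplit(url).hostname or ""
        depth = 2 if host == "news.ycombinator.com" else 1
        groups[depth].append(url)
    return groups
```

Each list in the result can then be fed to the archiver separately, with the depth that its key names.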

I think there are ways to hook into, like, ALL of your Firefox history or saved posts on reddit, but that's way heavier than what I care for.

[1]: https://archivebox.io/

computerfriend 1051 days ago

Interesting! Firefox history is just SQLite. I might do something like, take all non-search URLs and archive them once a month or so. Thanks for the inspiration.
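A minimal sketch of that idea, assuming the usual `places.sqlite` layout where the `moz_places` table has a `url` column and a `last_visit_date` column in microseconds since the Unix epoch. The `SEARCH_HOSTS` set and the function name are placeholders, not anything from Firefox itself, and the host list would need adjusting to whatever search engines you actually use.

```python
import sqlite3
from datetime import datetime, timezone
from urllib.parse import urlsplit

# Hypothetical list of search-engine hosts to skip; adjust to taste.
SEARCH_HOSTS = {"www.google.com", "duckduckgo.com", "www.bing.com"}


def recent_non_search_urls(conn, since):
    """Return http/https URLs visited at or after `since`, minus search hosts.

    `conn` is an open sqlite3 connection to a Firefox history database
    (ideally a copy, since the live file may be locked while Firefox runs).
    `since` is a timezone-aware datetime.
    """
    # last_visit_date is stored as microseconds since the Unix epoch.
    cutoff = int(since.timestamp() * 1_000_000)
    rows = conn.execute(
        "SELECT url FROM moz_places "
        "WHERE last_visit_date >= ? ORDER BY last_visit_date",
        (cutoff,),
    )
    urls = []
    for (url,) in rows:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https"):
            continue  # skips about:, file:, place: and similar entries
        if parts.hostname in SEARCH_HOSTS:
            continue
        urls.append(url)
    return urls


# Example use against a copy of the profile's history file (path is made up):
#   conn = sqlite3.connect("places-copy.sqlite")
#   since = datetime(2024, 1, 1, tzinfo=timezone.utc)
#   for url in recent_non_search_urls(conn, since):
#       print(url)
```

Run monthly with `since` set to the previous run's date, the printed list is what would go to the archiver.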