Hacker News
by NegativeK 1060 days ago
Wikipedia is the barest, insignificant smidge of the knowledge that's available on the internet. Yes, it's incredibly useful, but no, you can't stay within its bounds of knowledge for any appreciable time if you're actually trying to accomplish something beyond falling into a Wikipedia hole.

For manually curated content, Google and other search companies were solving that in 1998 because manual curation wasn't feasible for the amount of content on the internet. It's been 25 years since then, and we're not exactly producing less content online.

To further support my argument in favour of manual curation, I would like to ask: why, for the past few years, have people ‘in the know’ been appending “Reddit” to their search strings on Google? Because Reddit is the largest repository of manually curated links, discussions, and original non-encyclopedic information on the internet.

People want manually curated information. They don’t want automation, which by its very nature, as a fixed set of parameters, can and will be gamed.

I feel like Reddit is a bad example for an argument against search, because it still relies heavily on search. People aren't finding a specific subreddit for their query and then browsing it until they find what they're looking for, and they're usually not even searching directly on Reddit (because of the poor search quality); they're searching on Google.

What does "manual curation" mean in this context? I think it really just means "absence of spam/low quality content", which manual curation is not strictly necessary for.

Yahoo, back in the days before Google, was a manually curated search engine. They had a team of people doing the indexing work and creating the database from which the search engine pulled results. In principle, this avoids SEO spam because the manual curators aren't going to add spam sites to the index. Reddit functions similarly because, in principle, users aren't going to upvote spam posts.
Ah. But we're still using some algorithm that's doing curation to even look at reddit. I see your point about reddit being manually curated via upvotes, interaction, and moderation, but there still has to be something in between to even find the content and match it with what we're looking for. Even reddit's too big to just be a page of links, a la Yahoo.

Reddit has avoided

Because they want a bias-filtered selection of the internet. It is very obvious looking at the reddit comments that most are made by bots and not humans; the difference is that most behave similarly and the upvoting favors sameness.
That’s not at all obvious to me. What subreddits do you visit? The ones I frequented had maybe 5% or less bot comments. I think mods generally banned the bots from those subreddits too.
It's less about Wikipedia itself and more about its citations. Is Wikipedia sufficient to find nearly any important[0] information if you follow hyperlinks, or if you also look up physical references? What are the characteristics of the information that's missing? How long can you avoid Google for if you go along the chain of citations and links?

[0] objectively quantifying importance is an entirely different can of worms, but people know what's important to them