by jedberg 1984 days ago
> Reddit content ranks really well on Google

The most public contribution I made to reddit's codebase when I was there was the SEO features. I did all the usual stuff like cleaning up title and meta tags, and adding a sitemap. But the change that had the largest effect, by far, was adding the title of the story into the URL. As soon as we launched that, our Google traffic shot up.
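For anyone wondering what "adding the title of the story into the URL" can look like in practice, here is a minimal sketch in Python. It is not reddit's actual code: the function names `title_to_slug` and `story_url`, the 50-character limit, and the exact path layout are all assumptions made for illustration.

```python
import re
import unicodedata


def title_to_slug(title: str, max_length: int = 50) -> str:
    """Turn a story title into a short, URL-safe slug (hypothetical helper)."""
    # Fold accented characters to plain ASCII and drop anything that cannot be folded.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then collapse every run of non-alphanumerics into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_title.lower()).strip("_")
    if len(slug) > max_length:
        # Cut to the limit, then drop the last (possibly partial) word.
        slug = slug[:max_length].rsplit("_", 1)[0]
    return slug


def story_url(subreddit: str, story_id: str, title: str) -> str:
    """Build a comments-page path that carries the title (assumed layout)."""
    return f"/r/{subreddit}/comments/{story_id}/{title_to_slug(title)}/"


print(story_url("programming", "abc123", "Hello, World"))
```

In a scheme like this the ID still identifies the story, so the slug is there for readers and crawlers and can be truncated freely.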

The way you know you have truly mastered SEO is when Google takes away your control of the crawl rate on your SEO control panel. Soon after that we had to launch a special set of servers that just served requests for Google, because they were killing us by crawling years of old posts.

3 comments

I never understood why Google gives juice to these kinds of URLs. It makes URLs longer and is barely useful anymore, but Google keeps demanding it.
Because the URL is naturally short, you have to be very selective about what you put in it. So if the URL is "on topic", the page likely is too. Just like if the domain is "on topic", it is very likely the site is, because a domain is short and hard to change.

IIUC some of the biggest factors that Google uses for the page itself (network effects obviously play a huge part) are domain, URL, then title. Notice that these are fairly space-limited and user-visible, which makes it harder for the website author to spam them with possibly relevant keywords.

Think they wanted to bias against: "it's at http://zs9l.com/860d9fg%fids0a4?249F" and other URLs full of random alphanumerics, since normal people say URLs out loud.
Because these urls are effectively "locked" to the content and the website can't play a switcheroo on site visitors (and google), maybe?
Actually, the content often changes (usually news articles being updated with different content or a different title). Google's idea was that it's easier to tell what a URL is about by looking at it, but in the mobile era I don't think it matters anymore.
Reddit archives threads (making them read-only) after a year, I believe. Why would Google revisit archived threads? Did you see Google's crawler throttle its crawl rate based on response times?
Six months, and not back then. Back then you could comment on old threads. I wasn't there when they implemented the thread lock, but I suspect it was related.

The reddit codebase is designed for recency. Interacting with old threads really trashed the databases, at least back then.

I'm not certain, but doesn't a 'read only' reddit page still get changed when a user who was in it deletes their account and posts? Often when a search engine sends me to reddit, the comments that probably contained the relevant information are long gone.
You might be right, in which case Google should offer a link to the corresponding Internet Archive wayback page (where the deleted content should still exist).

Failing that, there are browser extensions for both Chrome and Firefox that enable this functionality.

I used IA's browser plugin that does this for a while, but unfortunately had to stop because it had too many false positives (sending me to archive pages for sites that were still live.) An extension that does this properly definitely interests me.
They have it in their own cache, so they wouldn't even have to link to IA.
Fascinating stuff! Any other simple tweaks that made a surprisingly big difference?
Off the top of my head I can't think of anything that was nearly as effective.