by suneilp 3903 days ago
PhantomJS allows you to render a page and fully manipulate or search it. It's a headless WebKit browser you can use from the command line, and it works pretty well. Google is obviously doing the same thing. They even used to show images of what a URL looks like in the search results. They stopped doing that, I suspect because it used up a lot of resources on many sites.
1 comment

I can say that Bing definitely does do JS interpretation as part of some of their renderings... I switched the URL routing in a relatively large site (about 300k routes, including navigable search URLs) so that the routes were all consistent, with every old URL pointing to its new route via a permanent redirect... previously the project had kept supporting all of its older routing schemes as they accumulated over time, and that was troublesome wrt SEO (the same content was reachable under many URLs: searches with identical parameters but a different URL structure, and likewise for individual content pages). When the change happened, we saw a huge uptick in Google Analytics hits (one page, no clickthroughs) coming from two locations... both turned out to be MS data centers. It was a relatively common problem.

It was always just a little white noise in the past, but when suddenly a couple hundred thousand pages permanently redirect... it was interesting.
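
For illustration, here is a minimal sketch of the kind of route consolidation described above: several legacy URL schemes collapse onto one canonical route, and anything non-canonical gets a permanent (301) redirect. The path prefixes, the `canonical_url` and `route` helpers, and the choice to sort query parameters are all assumptions made up for this example, not the routing the site actually used.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical legacy path prefixes, each mapped onto one canonical prefix.
LEGACY_PREFIXES = {
    "/items/view/": "/item/",
    "/product/": "/item/",
}


def canonical_url(url):
    """Return the single canonical path-plus-query form for a request URL."""
    parts = urlsplit(url)
    path = parts.path
    for old, new in LEGACY_PREFIXES.items():
        if path.startswith(old):
            path = new + path[len(old):]
            break
    # Sort query parameters so the same search always yields the same URL,
    # whatever order the parameters arrived in.
    query = urlencode(sorted(parse_qsl(parts.query)))
    return path + ("?" + query if query else "")


def route(url):
    """Return (status, location): 301 to the canonical URL, or 200 if already there."""
    parts = urlsplit(url)
    current = parts.path + ("?" + parts.query if parts.query else "")
    target = canonical_url(url)
    if target != current:
        return 301, target
    return 200, current
```

With these made-up prefixes, `route("/product/42")` returns `(301, "/item/42")`, while `route("/item/42")` returns `(200, "/item/42")`. Any crawler that follows redirects, script-driven or not, lands on the canonical page once per old URL, which is consistent with a one-time burst of hits when a couple hundred thousand routes change at once.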