by woah 4083 days ago
It's a headless browser. Node is a server-side runtime that runs JS.

Which means that it can handle scripts that rely on browser features like the DOM, whereas Node can't. So you can use it to test client-side code automatically (with a CI tool, of course). You can't do that in Node if your code accesses the DOM (e.g. calls something like document.getElementById or uses jQuery).

Btw, it is just a CLI wrapper around PhantomJS[1] I guess.

[1]: https://github.com/ariya/phantomjs
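
To make the DOM point concrete, here is a minimal sketch you can run in plain Node. The helper names (readTitle, hasDom, tryReadTitle) and the 'title' element id are made up for illustration; nothing here is PhantomJS-specific. It only shows that plain Node has no global `document`, so DOM-touching code fails before it can do anything.

```javascript
// Hypothetical client-side helper: reads the text of an element by id.
// In a browser (headless or not) `document` is a global; plain Node has none.
function readTitle() {
  return document.getElementById('title').textContent;
}

// True when a global `document` exists (i.e. we are in a browser-like host).
function hasDom() {
  return typeof document !== 'undefined';
}

// Calls readTitle() and reports the outcome instead of crashing the process.
function tryReadTitle() {
  try {
    return { ok: true, value: readTitle() };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

console.log('DOM available:', hasDom());
console.log(tryReadTitle());
```

Run under plain Node, the call fails with a ReferenceError because `document` is not defined. Inside a headless browser the same readTitle() would run against a real page, which is the whole reason to use one for client-side tests.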

Another (popular?) use is for SPA or JS-heavy pages to render themselves for search engines. Crawl the site with PhantomJS. For each page, give it a few seconds to do XHR and render things, then save the HTML snapshot. When you get a request from Googlebot, serve the HTML snapshot instead of the app, which Google apparently still cannot handle. Ta-da!

(And piss off people who hate JS "apps" or still desire the HTTP/HTML document ideal, for better or for worse.)
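
A rough sketch of the serving side of that idea, assuming the snapshots were already produced by a headless-browser crawl. Everything named here (the snapshots map, CRAWLER_PATTERN, isCrawler, chooseBody, the app shell string) is hypothetical, and matching on the User-Agent string is just one simple way to spot a crawler.

```javascript
// Assumption: an earlier headless-browser crawl saved rendered HTML per path.
const snapshots = new Map([
  ['/', '<html><body><h1>Home</h1></body></html>'],
]);

// Assumption: crawlers are recognized by User-Agent substring only.
const CRAWLER_PATTERN = /googlebot|bingbot/i;

// True when the User-Agent looks like one of the crawlers we snapshot for.
function isCrawler(userAgent) {
  return CRAWLER_PATTERN.test(userAgent || '');
}

// Picks the response body: the saved snapshot for crawlers (when we have
// one for this path), otherwise the normal JS app shell.
function chooseBody(path, userAgent, appShell) {
  if (isCrawler(userAgent) && snapshots.has(path)) {
    return snapshots.get(path);
  }
  return appShell;
}

const shell = '<html><body><div id="app"></div><script src="/app.js"></script></body></html>';
const botUA = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
console.log(chooseBody('/', botUA, shell) === snapshots.get('/'));
console.log(chooseBody('/', 'Mozilla/5.0 (X11; Linux x86_64)', shell) === shell);
```

Falling back to the app shell when no snapshot exists means a crawler never gets an error just because a page hasn't been crawled yet.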

It's worth noting that Googlebot does seem to render pages with JS, though not as frequently, and usually several days behind a change detected without JS. Bingbot definitely does, and this can even be seen in Google Analytics, oddly enough (Google doesn't seem to filter out non-browser rendering).

I learned a lot of this when changing the URL structure for a few hundred thousand pages (with permanent redirects in place)... the Bingbot results in analytics were really surprising, and I had to adjust the filtering.

Interesting; I should play with it. I got the impression they "sorta" do JS, but wouldn't run a bunch of XHR and so on.

This reminds me of the time they removed the Vulcan cannon at the start of the Vietnam War because they thought the days of dogfighting were over because of missiles.

Then their jets started falling left and right and they ended up installing it back again.

I don't feel that's a proper analogy. We save a ton of time by only maintaining a single layout/rendering system. If a page pulls in 5+ different assets to render, doing it on the server means we'd have to come up with a lot more logic on top of what already gets it done on the client side. And what better way than to just run a browser to get it all done? It's like the ultimate server-side rendering framework.