by almost 6318 days ago
Agreed, this is more of a "how to scrape data from sites you can log into". It's not even a very useful example of that.

For anyone who wants to scrape sites that require a login, I'd recommend Python with Twill. That lets you do the whole thing with ease.
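
For a rough idea of what a scripted login involves, here is a minimal sketch using only the Python standard library rather than Twill itself. The URL, form field names, and credentials are all made up for illustration; a real site will use its own, and may need extra hidden fields.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def build_login_session(login_url, username, password):
    """Prepare a cookie-aware opener and a form POST; nothing is sent yet."""
    jar = CookieJar()
    # The opener stores any cookies the server sets and replays them later.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Field names "username" and "password" are assumptions about the form.
    form = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("ascii")
    # Passing data= makes this a POST request.
    request = urllib.request.Request(login_url, data=form)
    return opener, request, jar


# Hypothetical site and credentials, for illustration only.
opener, login_request, jar = build_login_session(
    "https://example.com/login", "alice", "secret"
)
# opener.open(login_request)                   # would submit the login form
# opener.open("https://example.com/private")   # would reuse the session cookies
```

The two network calls are left commented out so the sketch runs offline. Twill wraps the same idea (fill in the form, submit, keep the cookies) in a higher-level scripting interface.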


Twill is indeed an option, but this way you don't miss out on JavaScript. You can take advantage of all of your browser's features.
But the article is suggesting just copying cookies from Firefox into a simple Java-based scraper. That won't support JavaScript either.
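
To make that concrete, the cookie-copying approach boils down to something like the sketch below (in Python here rather than the article's Java). The cookie names, values, and URL are placeholders I made up; the point is that the scraper only sends a Cookie header and never runs any page scripts.

```python
import urllib.request


def request_with_browser_cookies(url, cookies):
    """Build a request that replays cookies copied by hand from a browser."""
    # Join name=value pairs the way a browser formats the Cookie header.
    header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    request = urllib.request.Request(url)
    request.add_header("Cookie", header)
    return request


# Placeholder cookie names and values, for illustration only.
req = request_with_browser_cookies(
    "https://example.com/account",
    {"sessionid": "abc123", "csrftoken": "xyz"},
)
# urllib.request.urlopen(req)  # would fetch the raw HTML; no JavaScript is executed
```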

If you needed JavaScript, you could use one of the Firefox scripting bridges (Selenium or MozRepl).