by tarmac 6318 days ago
You're still technically logging in by providing the copied cookie. I don't see any difference here.

Or rather, the title should be changed to "How to scrape your data from your sites that require login".

Agreed, this is more of a "how to scrape data from sites you can log into". It's not even a very useful example at that.

For anyone who wants to scrape sites that require login, I'd recommend Python with Twill. That lets you do the whole thing with ease.
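
For illustration, here is a rough sketch of the same idea (log in from the script, then reuse the session) using only the Python standard library rather than Twill itself. The URLs and the form field names `username` and `password` are assumptions; a real site will have its own.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that remembers cookies, plus the jar holding them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, username, password):
    """Build a POST request carrying the login form (field names are hypothetical)."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    # Passing data makes urllib send a POST instead of a GET.
    return urllib.request.Request(login_url, data=form.encode("ascii"))


# Usage, left commented out because it needs a real site:
# opener, jar = make_session()
# opener.open(build_login_request("https://example.com/login", "alice", "s3cret"))
# html = opener.open("https://example.com/private").read()
```

Any cookies the login response sets end up in the jar, so later requests through the same opener are sent as the logged-in user.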

Twill is an option indeed, but this way you don't miss out on the JavaScript. You can take advantage of all of your browser's features.

But the article is suggesting just copying cookies from Firefox into a simple Java-based scraper. That won't support JavaScript either.
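
To make that concrete, copying a cookie into a scraper amounts to roughly the following. This is a Python sketch, not the article's Java code, and the cookie name `sessionid`, its value, and the URL are placeholders. Nothing here runs the page's scripts; it only sends a header.

```python
import urllib.request


def build_request_with_cookie(url, cookies):
    """Attach cookies copied from the browser to a plain HTTP request."""
    # A Cookie header is just "name=value" pairs joined by "; ".
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header)
    return request


# Placeholder values; in practice these come from the browser's cookie store.
request = build_request_with_cookie(
    "https://example.com/private",
    {"sessionid": "abc123"},
)
# html = urllib.request.urlopen(request).read()  # needs a real site
```

The server sees the same session the browser had, which is why this is still logging in, just with the credential step done by hand.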

If you need JavaScript, you could use one of the Firefox scripting bridges (Selenium or MozRepl).