by goostavos 4982 days ago

In general, if you're going the mechanize route, .retrieve() is the function you're looking for.

e.g.

  import mechanize
  br = mechanize.Browser()
  # retrieve() returns a (filename, headers) tuple; [0] is the saved file's path
  br.retrieve("https://www.google.com/images/srpr/logo3w.png", "google_logo.png")[0]
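
For anyone who wants to try the pattern without installing anything, here is a rough sketch using the standard library's urllib.request.urlretrieve, which also returns a (filename, headers) tuple. The download() helper and its retrieve parameter are made up for illustration; passing mechanize.Browser().retrieve in place of the default is an assumption, not something tested here.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve


def download(url, dest, retrieve=urlretrieve):
    """Save url to dest and return the local path.

    retrieve is any callable returning a (filename, headers) tuple.
    Swapping in mechanize.Browser().retrieve is an assumption of this sketch.
    """
    filename, _headers = retrieve(url, dest)
    return filename


if __name__ == "__main__":
    # A local file:// URL keeps the demo runnable without network access.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "source.bin"
        src.write_bytes(b"placeholder bytes standing in for an image")
        saved = download(src.as_uri(), os.path.join(tmp, "copy.bin"))
        print(saved)
```

Like the plain urllib call, this only fetches what the server sends back, so it will not help with content that a page builds in JavaScript.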

Mechanize doesn't really have proper documentation, but just about everything you'd need can be figured out from the very lengthy examples page on its site.

1 comments

bdcravens 4982 days ago

Playing with it now, and while it seems to cover my download needs, I can't get it to play nice with sites that are JavaScript-dependent. Am I missing something, or is there a way to plug in an underlying WebKit engine?

bryogenic 4982 days ago

PhantomJS is capable of downloading binary content from JS-dependent sites, but it is a journey to get it working, since it is not an out-of-the-box feature. Instead, use CasperJS to drive Phantom and get a ton of snazzy features, including simple binary downloads. Happy scraping!