Hacker News
by robbfitzsimmons 4038 days ago
This is an awesome attempt at a big problem. My financial data seems like one of the most monetizable aspects of my life, and I'm particularly reluctant to give it to Intuit et al., but I never seem to have much of a choice.

I'm actually curious why a broader community effort hasn't sprouted up around web scraping for banks, given how horrible API support has traditionally been. Plaid (plaid.com) purports to make this easy, but it's not very mature yet and will be a paid service.

4 comments

For those interested, Wesabe open sourced a lot of their code: https://github.com/wesabe

There's even a method for running the service in a virtual machine: https://github.com/wesabe/mesabe/wiki

They also have some OFX tools.

The wisdom contained in those repos might be a good place for a community effort to start.

Weboob (http://weboob.org) has a moderately active community. There are even a couple of finance tools built around it, though the web services it scrapes are mostly European.
[OFX](http://www.ofx.net/) already exists.
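For anyone who hasn't looked at what an OFX export contains, here is a rough sketch of pulling transactions out of one. It assumes an OFX 1.x (SGML-style) file where leaf tags are not closed. The sample data and the `parse_transactions` helper are made up for illustration, and a real export has headers, account info, and more fields than this handles.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical OFX 1.x (SGML) fragment, invented for this example.
SAMPLE_OFX = """
<BANKTRANLIST>
<STMTTRN>
<TRNTYPE>DEBIT
<DTPOSTED>20140102
<TRNAMT>-42.50
<FITID>1001
<NAME>COFFEE SHOP
</STMTTRN>
<STMTTRN>
<TRNTYPE>CREDIT
<DTPOSTED>20140103120000
<TRNAMT>1500.00
<FITID>1002
<NAME>PAYROLL
</STMTTRN>
</BANKTRANLIST>
"""

# Each transaction sits inside a <STMTTRN> ... </STMTTRN> aggregate.
BLOCK_RE = re.compile(r"<STMTTRN>(.*?)</STMTTRN>", re.DOTALL)
# Leaf elements look like <TAG>value, with the value ending at the next tag or newline.
FIELD_RE = re.compile(r"<([A-Z0-9.]+)>([^<\r\n]+)")


def parse_transactions(ofx_text):
    """Return a list of dicts, one per <STMTTRN> block found in the text."""
    transactions = []
    for block in BLOCK_RE.findall(ofx_text):
        fields = {tag: value.strip() for tag, value in FIELD_RE.findall(block)}
        transactions.append({
            "id": fields.get("FITID"),
            "type": fields.get("TRNTYPE"),
            # DTPOSTED starts with YYYYMMDD; any time or zone suffix is ignored here.
            "posted": datetime.strptime(fields["DTPOSTED"][:8], "%Y%m%d").date(),
            # Decimal avoids float rounding on money amounts.
            "amount": Decimal(fields["TRNAMT"]),
            "name": fields.get("NAME"),
        })
    return transactions


if __name__ == "__main__":
    for txn in parse_transactions(SAMPLE_OFX):
        print(txn["posted"], txn["amount"], txn["name"])
```

A regex pass like this is only good enough for a quick look at your own exports; for anything serious, a proper OFX parser is the safer choice.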
Is it legal to scrape data from a bank's website, even if you're only scraping your own data?
Yes, scraping is legal in most cases. And if you do it yourself, you also avoid handing your login credentials to a third party, which might itself be a liability issue.
From personal experience, I've tried scraping my own TD Ameritrade data and quickly got flagged by their system as a virus because my request pattern likely failed some heuristic check. They disabled my account "until I get that cleaned up" and I had to plead with multiple people over the phone just to get it re-enabled.

The overall experience was quite terrible, but I wouldn't be surprised if it's the norm.

I can certainly imagine this happening. I'm currently scraping 4 banks and 4 shops on a daily basis, and the only issue came up when scraping for the first time from a new IP address: they asked a security question or displayed a captcha. I resolved this by logging in the first time from a browser through an SSH tunnel over that IP. All subsequent scraping has gone well since then.
Opposing personal experience: I've been scraping my bank data (Scotia) for about a year now and haven't had any serious problems (there seems to be some kind of sign-in counter that, past a threshold, forces the secret questions to be reset, but that's it).
A concerted community effort could develop counter-heuristics.
If not, it's hard to see how Mint could still be alive.