by imrehg 1428 days ago
I was looking at the source code linked from the dashboard, and was a bit surprised that it's a gist rather than a repository. Is that just to facilitate discussion of the code? Otherwise it feels a bit ad hoc that way, wouldn't consider a gist with this many files more ergonomic compared to a repository.

Also, when I try to follow the link to the data source on the ERCOT website https://www.ercot.com/ I get blocked with a message like this:

    Access Denied
    Error 16
    This request was blocked by the security rules.
    If you believe you have a valid business reason for accessing ERCOT resources, please contact the ERCOT ServiceDesk at ServiceDesk@ercot.com.

This seems ... pretty old school as well. Is it geofencing, or more general IP whitelisting? It feels very strange. I'm guessing the former, as Google's cache returns a page that can be viewed. I don't see anything that would _really_ be a drain on those ERCOT resources...
4 comments

> This seems ... pretty old school as well.

This doesn't surprise me. One of my projects as an intern was to generate reports from public data from ERCOT and a few other North American ISOs (organizations that control a region's power grid). Some of them were happy to reply to a basic cURL request. Others were more particular about things like your user agent or the cookies you had set (even sites that didn't require authentication would set a cookie on first load and reject requests to certain pages without that cookie). A few were very particular and ran some JavaScript code to serve requests, which meant something like Selenium was required.
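
For the middle case (a site that wants a browser-like user agent and a cookie from a first page load), a minimal sketch with the Python standard library could look like the following. The function names, the user agent string, and both URLs are placeholders I made up for illustration; this is not what any particular ISO requires, and it does nothing for the sites that need JavaScript.

```python
import http.cookiejar
import urllib.request


def make_session(user_agent):
    """Return (opener, jar): an opener that sends user_agent and keeps cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default "Python-urllib/x.y" header with the given string.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def fetch_with_warmup(opener, landing_url, data_url, timeout=30):
    """Load landing_url first so the site can set its cookie, then fetch data_url."""
    # The response body is not needed; the cookie jar captures any Set-Cookie.
    opener.open(landing_url, timeout=timeout).close()
    # The second request sends back whatever cookies the first one received.
    with opener.open(data_url, timeout=timeout) as resp:
        return resp.read()
```

Usage would be something like `opener, jar = make_session("Mozilla/5.0 (placeholder)")` followed by `fetch_with_warmup(opener, landing_url, data_url)`, with both URLs being whatever pages the site in question actually uses.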

It was a bit surprising to me how many hoops we had to jump through, considering this is public data that taxpayers are entitled to.

If I recall correctly, PJM was the best to work with because they had a well-designed REST API and provided a developer guide with sample code.

> I don't see anything that would _really_ be a drain on those ERCOT resources...

I'm not sure if these controls are in place to safeguard against abuse. Most people don't even know these companies exist and I doubt they get too many requests. I'm assuming this is just a symptom of software sold to the lowest bidder.

> Otherwise it feels a bit ad hoc that way, wouldn't consider a gist with this many files more ergonomic compared to a repository.

All gists are actually Git repositories hosted by GitHub: you can clone them, branch, and commit changes. There are no PRs and no issue tracker, though.

(Same with GitHub wikis.)
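
To make the cloning point concrete, here is a small sketch that builds the clone URLs and shells out to `git`. The gist ID, owner, and repository name are placeholders you would fill in yourself, and the helper names are made up for illustration; it assumes `git` is installed and on your PATH.

```python
import subprocess


def gist_clone_url(gist_id):
    """HTTPS clone URL for a gist, given the ID from its page URL."""
    return f"https://gist.github.com/{gist_id}.git"


def wiki_clone_url(owner, repo):
    """HTTPS clone URL for a repository's wiki."""
    return f"https://github.com/{owner}/{repo}.wiki.git"


def clone(url, dest):
    """Run `git clone url dest`; raises CalledProcessError if git fails."""
    subprocess.run(["git", "clone", url, dest], check=True)
```

After `clone(gist_clone_url(gist_id), "some-dir")` you have an ordinary working copy with history, so the usual branch and commit workflow applies.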

Yeah, from my EU IP address I could not visit the site; with a US IP address from my VPN I could. I don't understand why they would geofence it, honestly, but that's how it is.
They probably got DDoS-like usage patterns from badly behaved scrapers around the world. Geofencing is the easiest solution while still keeping the data public domestically.

It's easy to blame them, but if you provide any data on the web, the load you get from scrapers is insane.

I think they block anything outside the US.
They also seem to allow Canada ... and Sydney, at least out of all of the AWS regions.
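
If anyone wants to repeat this kind of check from different regions, a rough sketch is below. It just looks for the sentence from the block page quoted at the top of the thread; the marker string comes from that quote, while the function names and the user agent value are my own placeholders. I don't know which HTTP status the block page comes back with, so the code reports the status instead of assuming one.

```python
import urllib.error
import urllib.request

# Sentence taken from the "Access Denied" page quoted in the original post.
BLOCK_MARKER = "This request was blocked by the security rules"


def is_block_page(body):
    """True if the response body looks like the quoted block page."""
    return BLOCK_MARKER in body


def probe(url, timeout=15):
    """Fetch url and return (http_status, blocked) for the current vantage point."""
    # Placeholder user agent; not known to be what the site expects.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status, raw = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # Error responses still carry a body, which may be the block page.
        status, raw = err.code, err.read()
    # Connection-level failures (urllib.error.URLError) are left to propagate.
    return status, is_block_page(raw.decode("utf-8", errors="replace"))
```

Running `probe("https://www.ercot.com/")` once per region or VPN exit would give a crude allow/block table, with the caveat that a block could also be keyed on something other than geography.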