I was using a physical RGB LED, so there wasn't much extra bandwidth for conveying information. I live in a cold climate, so green really meant "Leave now and you'll only have to wait in the cold at the stop for 2-3 minutes". Here's the code if anyone's interested[1]. Looks like my memory wasn't quite right: I did blue/green/red, with blue=wait, green=leave, red=missed.
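For anyone who doesn't want to click through, the blue/green/red idea can be sketched roughly like this. This is not the code from the gist, just a minimal illustration; the function name `led_color`, the 5-minute walk time, and the 3-minute maximum wait are all assumptions made up for the example.

```python
def led_color(minutes_until_arrival, walk_minutes=5.0, max_wait_minutes=3.0):
    """Map a predicted arrival time to an LED color.

    walk_minutes and max_wait_minutes are hypothetical defaults,
    not values taken from the original project.
    """
    # Time you would spend standing at the stop if you left right now.
    slack = minutes_until_arrival - walk_minutes
    if slack < 0:
        return "red"    # missed: you can't reach the stop in time
    if slack <= max_wait_minutes:
        return "green"  # leave now: only a short wait in the cold
    return "blue"       # wait: leaving now means a long wait at the stop
```

With those made-up defaults, a bus 7 minutes away would show green, since you'd wait about 2 minutes at the stop.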
Much better than the MTA's website... though slightly less useful if you're not me commuting to work.
What I learned from this exercise is that:
1) The estimates are consistently inaccurate; downtown trains at Chambers St. always arrive when the clock says "2" (minutes until arrival).
2) They use some sort of distributed cache that doesn't remain consistent; as you bounce between backend instances you get different results, but often the same two results. (The red/green lines under the station names indicate freshness.)
3) The clocks in the stations don't work when it's too hot, but the actual data collection/processing is fine.
That's interesting. I worked on a service based on TfL (London) data, and they were incredibly picky about ensuring people would not get misled by signage. Of course sometimes there are tradeoffs, but they had extensive rules to ensure the tradeoffs minimised negative experiences. E.g. we had to take any buses off the boards if the data was more than x seconds old, never ever allow enough clock drift or other issues to cause our displays to be off by more than a certain amount, clear the whole display if we got no API response within a certain amount of time, etc.
It was very annoying to implement, but as a user it is very nice to know how much thought has gone into it.
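To give a flavour of the first and last rules, here is a rough sketch of that kind of staleness filter. It is not the actual implementation or anything from TfL's requirements; the names `Prediction` and `visible_rows` are hypothetical, and the 30 s and 60 s thresholds are placeholders, since the real limits aren't stated here.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    route: str
    eta_seconds: int
    fetched_at: float  # timestamp (seconds) when this prediction was received


def visible_rows(predictions, now, last_response_at,
                 max_age_s=30.0, api_timeout_s=60.0):
    """Return the predictions that are still safe to show on a board.

    max_age_s and api_timeout_s are placeholder thresholds (assumptions).
    """
    # No API response for too long: blank the whole display
    # rather than risk showing misleading times.
    if now - last_response_at > api_timeout_s:
        return []
    # Otherwise drop only the individual rows whose data is too old.
    return [p for p in predictions if now - p.fetched_at <= max_age_s]
```

The point of the two separate checks is that an empty board is treated as less harmful than a confidently wrong one.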
[1] https://gist.github.com/kevana/32bfa486d9fb0aa20a19694d1b69d...