by _ea1k 2271 days ago
I think that is small-think. The technical solution is only part of the problem, and scaling up all systems to meet the 0.1% case seldom makes sense. They were smart to save 5-10%.

Eh... On the flip side, a system that processes and stores some simple text forms should be able to handle thousands of simultaneous users on one box.

So, probably like most software of this nature, the reason it's not scaling is simply that the people who made it weren't the greatest engineers on the block.

These are the same kinds of assumptions that lead engineers to think they can build a [any product] clone in a weekend. It's unlikely that the problem or constraints are nearly as simple as one may think.

Consider: single auth across all the state's services, external APIs, identity verification, address verification, employer ID verification, federal/military ID verification, income/tax verification, phone verification, bank account information, translation into multiple languages, accessibility features, etc. Also, there's probably a lot of legacy infrastructure and process.

Also, if "ability to burst to 10x normal filings per week that might happen once every 40 years" wasn't in the spec, I think they were right not to engineer for it.

Admittedly it's a value call. My thought is generally that if a small incremental cost greatly increases robustness, you should go for it. But sometimes the money or time just isn't there. I'm bothered more by people who don't even want to have the discussion than by those who do a summary analysis and decide it's not worth it.
That's a fair point. My comment comes from being in too many meetings where people want Twitter scale for conference-room-sized user bases.

It sometimes borders on sealioning.

The 0.1% case happens. And if it’s going to seriously wreck lives when it happens, then you should solve for it. Does Instagram need to handle the 0.1% case? No. But the unemployment website should.
Unemployment forms being delayed by a day or two to deal with poor queuing will not "wreck lives".