by lloydatkinson 1605 days ago
I had the same thoughts and then it was confirmed how insane this setup is part way through:

“At the time of writing, the Citus distributed database cluster adopted by the team on Azure is HA-enabled for high availability and has 12 worker nodes with a combined total of 192 vCores, ~1.5 TB of memory, and 24 TB of storage. (The Citus coordinator node has 64 vCores, 256 GB of memory, and 1 TB of storage.)”

That’s beyond overkill for something that as you say could be generated statically a couple of times a day.

6 comments

It's probably overkill, but not really enough overkill to be worth spending much time on.

E.g. 12 worker nodes and 192 vCores means they've picked 16-core nodes. 1.5 TB of memory across 12 nodes means 128 GB per node. 24 TB of storage is just 2 TB per node.

So it's 12 relatively mid-sized servers/VMs.
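
For anyone who wants to check the per-node numbers, here is a quick sketch of the division. The totals are the ones quoted from the article; treating "~1.5 TB" as 1.5 * 1024 GB is my assumption, and the variable names are just for illustration.

```python
# Worker-node totals as quoted in the article (coordinator excluded).
WORKERS = 12
TOTAL_VCORES = 192
TOTAL_MEMORY_GB = 1.5 * 1024  # assumption: "~1.5 TB" read as 1536 GB
TOTAL_STORAGE_TB = 24


def per_node(total, nodes=WORKERS):
    """Split a cluster-wide total evenly across the worker nodes."""
    return total / nodes


vcores_per_node = per_node(TOTAL_VCORES)
memory_gb_per_node = per_node(TOTAL_MEMORY_GB)
storage_tb_per_node = per_node(TOTAL_STORAGE_TB)

print(f"{vcores_per_node:g} vCores, "
      f"{memory_gb_per_node:g} GB RAM, "
      f"{storage_tb_per_node:g} TB storage per worker")
```

This only assumes the resources are spread evenly across the workers, which the article doesn't state outright.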

They could certainly do it with much less, and I have no interest in looking up what 12 nodes of that spec would cost on Azure, but at Hetzner it'd cost less than 1500 GBP/month including substantial egress. At most cloud providers the bandwidth bill for this likely swamps the instance cost, and the developer cost to develop this is likely many times the lifetime projected hosting cost even with that much overkill.

If they happen to have someone familiar with query caching and CDNs, I'm sure they could cut it significantly very quickly, and even an entirely average developer could figure out how to trim that significantly over time. But even at (low) UK government contract rates it's not worth spending much time trimming a bill like that vs. just picking whatever the developers who worked on it preferred.

> generated statically a couple of times a day.

That would require actual work instead of selling an overpriced generic solution.

Did you look at the 3 different (non-trivial) APIs they are offering on top of the dashboard? Though I have a hard time understanding why use PostgreSQL instead of ClickHouse, for example.
No I didn’t tbh, I didn’t read much further. Notice how one sentence says Postgres was chosen because it was somebody’s preference.
You will always be faster with worse tools you know than with better tools you don't know.
True but why does it also need terabytes of storage and 12 worker nodes?
I imagine getting something up quickly was a priority, rather than spending longer architecting and optimising.
My suspicion is that since this has to do with COVID, there is no real limit on what the cost should really be.

As for using the setup for other things, that seems less likely given this expensive setup.

> could be generated statically a couple of times a day

Hell, let's do some partial evaluation: just bake the computed HTML into the source code and recompile that a few times a day. No need to even read from a file when you can fetch it from rodata.

As for the reason why they did it this way, I assume it's a combination of CV-driven development along with the hackernoon-reading-junior-engineer-meets-cunning-salesperson effect which others have noted.

Yes, the static render option seems optimal. However, if an API is being offered then something dynamic is mandated, which forces scaling of the data tier. It seems like even a basic app cache would suffice.
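
To make "basic app cache" concrete, here is a minimal sketch of an in-process cache with a time-to-live, assuming API responses can be a few minutes stale. `TTLCache`, `get_or_compute`, and the key format are hypothetical names for illustration, not anything from the article or a real library.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-memory cache: entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling compute() only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: skip the database entirely
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=300)

    def run_query() -> str:
        # Stand-in for the real database query (hypothetical).
        return "placeholder result"

    # The second call with the same key is served from memory.
    print(cache.get_or_compute("cases:by-region", run_query))
    print(cache.get_or_compute("cases:by-region", run_query))
```

With data that only changes a couple of times a day, even a short TTL would mean most reads never reach the database. A real deployment would also want eviction and a cache shared across app instances, which this sketch leaves out.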

Alternatively, we're building https://www.polyscale.ai/ that is a good fit for this type of use case. It's a global database cache and integrates with Postgres/MySQL etc. We host PoPs globally so the database reads are offset and local to users.

Agree with the other comments in that this feels like a shiny use case to quote to other prospects, but all good :)