by dreyfan 1685 days ago
Don’t use DB providers that charge for rows/data scanned. Use Amazon RDS or Google Cloud SQL or just install it yourself on a VM. Pay for CPU, memory, and storage instead.

Rent metal and run your own MySQL/Postgres/...

One insert every 3 seconds. Could run that off a 10 year old laptop.

Sure, but we're addressing people who are so far on the other end of the spectrum they're using a "serverless" database where they pay for the number of rows scanned per query. I think a managed DB is a better middle-ground for their capability level while still delivering massive cost-savings.

Amazon RDS lowest-tier runs about $13/mo for 10GB storage, 2 vCPUs and 1GB memory with automated backups and push-button restoring. And that would have likely met all of their needs with capacity to spare.
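
To put rough numbers on the comparison above, here is a back-of-the-envelope sketch. The insert rate (one every 3 seconds) and the roughly $13/mo flat price come from this thread; the per-row price, the query count, and the rows scanned per query are hypothetical placeholders, not any provider's real pricing.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def inserts_per_month(seconds_between_inserts: float) -> int:
    """How many inserts arrive in a month at a steady rate."""
    return int(SECONDS_PER_MONTH // seconds_between_inserts)


def scan_billed_cost(queries: int, rows_scanned_per_query: int,
                     price_per_billion_rows: float) -> float:
    """Monthly bill when the provider charges per row scanned."""
    return queries * rows_scanned_per_query / 1e9 * price_per_billion_rows


FLAT_MONTHLY_PRICE = 13.0      # figure quoted in the thread for a small RDS instance
PRICE_PER_BILLION_ROWS = 1.0   # HYPOTHETICAL price, for illustration only

monthly_inserts = inserts_per_month(3)  # 864000

# Assume (hypothetically) one lookup per insert. Without an index each
# lookup scans a million rows; with an index it touches a single row.
cost_full_scan = scan_billed_cost(monthly_inserts, 1_000_000, PRICE_PER_BILLION_ROWS)
cost_indexed = scan_billed_cost(monthly_inserts, 1, PRICE_PER_BILLION_ROWS)

print(f"inserts/month:        {monthly_inserts}")
print(f"scan-billed, no index: ${cost_full_scan:,.2f}")
print(f"scan-billed, indexed:  ${cost_indexed:,.6f}")
print(f"flat instance:         ${FLAT_MONTHLY_PRICE:,.2f}")
```

Under these made-up inputs the scan-billed bill swings by six orders of magnitude depending on the query shape, while the flat instance costs the same either way.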

The time spent setting it up and managing it and then having to deal with backups/environment clones/access control/scaling limitations/etc. outweighs the savings for almost any company paying US wages. Especially since you'd need metal for everything and not just the db due to network latency.
I think you're overestimating how complicated that stuff is...
It's very easy to do it in a half-assed way and much harder to do it at scale in a production environment with many developers without hurting developer productivity at all.
Every cloud-hosted startup I've consulted for had a full-time devops guy wrangling Terraform and YAML files. The cloud requires an equivalent time investment.
Bare metal requires the equivalent of all of that devops stuff and then more. That is, if you actually want parity and not just a half-assed version that hurts developer productivity and causes technical debt.
They clearly don't have the skills for that. And at one insert every 3s, a managed service like RDS won't really cost much.
... until you forget to create an index, apparently.
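
For anyone wondering why a missing index matters so much under rows-scanned billing, here is a minimal sketch using SQLite as a stand-in (the thread is about MySQL/Postgres-style services, and their planners and billing differ). The table, column, and index names are made up for illustration. It only shows that the same query switches from a full table scan to an index lookup once the index exists.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> str:
    """Return SQLite's plan description for a query (the 'detail' column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, customer_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (customer_id, payload) VALUES (?, ?)",
    [(i % 100, "x") for i in range(10_000)],
)

sql = "SELECT * FROM events WHERE customer_id = ?"

# No index yet: the planner has to read every row to find the matches.
plan_without_index = query_plan(conn, sql, (42,))

# Hypothetical index name; any index on customer_id would do.
conn.execute("CREATE INDEX idx_events_customer ON events (customer_id)")

# Same query, now answered by an index lookup on customer_id.
plan_with_index = query_plan(conn, sql, (42,))

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

On a provider that bills per row scanned, the first plan is billed for the whole table on every call; the second only for the matching rows.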
Rows returned model works really well for certain data loads (where all data customers use is customer-keyed)....

This model also scales DOWN really well .. while still providing good scalable availability...

That said, I DO agree with the sentiment of paying for a set performance level (CPU, memory, storage) to get predictable pricing. Obviously these guys were bitten by the scaling capability.

I do a lot of pet projects, and I find DynamoDB works really well because my pet projects cost $0 most months... And I don't have to worry about servers, maintenance, or whatnot... I'm happy to do that at work, but I don't want that for my friends & fun projects... And I've not seen a decent managed DB like RDS for <$5/month

Disclosure: I used to work on Google Cloud.

This is why BigQuery offers both models and lets you control the caps [1].

Buying fixed compute is effectively buying a throughput cap. Hard Quotas provide a similar function, but aren't a useful budgeting tool if you can't set them yourself.

"Serverless" without limits is basically "infinite throughput, infinite budget" (though App Engine had quotas since day 1 and then budgets once charging was added). The default quotas give you some of that budget / throughput capping, but again if you can't lower them they might not help you.

Either way, BQ won't drop ingestion or storage, because almost nobody wants their data deleted. As a provider, implementing strict budgets is impossible without a fairly complex policy: "if over $X/second stop all activity, oh except let me still do admin work, like adding indexes? Over $Y/second delete everything". I think having user-adjustable quotas and throughput caps per "dimension" makes more sense, but it puts the burden on the user, and no provider offers good enough user control over quota.
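
As a rough illustration of what per-"dimension" caps with an admin carve-out could look like, here is a toy sketch. `QuotaGuard`, its methods, and the dimension names are all hypothetical; this is not how BigQuery or any other provider implements quotas, just the shape of the policy described above.

```python
class QuotaGuard:
    """Toy per-dimension spend cap for one accounting window."""

    def __init__(self, caps: dict):
        # dimension name -> max spend per window; a missing dimension is uncapped
        self.caps = dict(caps)
        self.spent = {}

    def try_charge(self, dimension: str, cost: float, admin: bool = False) -> bool:
        """Record the charge and return True, or refuse it and return False."""
        used = self.spent.get(dimension, 0.0)
        cap = self.caps.get(dimension)
        # Admin work (e.g. adding an index) is let through even when over the
        # cap, so the user can still fix whatever caused the overrun.
        if not admin and cap is not None and used + cost > cap:
            return False
        self.spent[dimension] = used + cost
        return True

    def reset_window(self) -> None:
        """Start a new accounting window (e.g. a new second, hour, or day)."""
        self.spent.clear()


# Cap query spend only; ingestion is deliberately left uncapped so that
# incoming data is never dropped.
guard = QuotaGuard({"query": 10.0})

first = guard.try_charge("query", 8.0)                # fits under the cap
second = guard.try_charge("query", 5.0)               # would exceed the cap
admin_fix = guard.try_charge("query", 5.0, admin=True)
ingest = guard.try_charge("ingest", 1000.0)           # uncapped dimension

print(first, second, admin_fix, ingest)
```

Even this toy version has to decide what counts as admin work and which dimensions are never throttled, which is the policy complexity the comment is pointing at.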

tl;dr: true budgets are hard to do, but every provider should strive to offer better quota/throughput controls.

[1] https://cloud.google.com/bigquery/pricing

[2] https://cloud.google.com/bigquery/docs/reservations-workload...