Hacker News
by throwusawayus 1492 days ago
tz setup is done by executing a single script. not surprising they could fix quickly. bigger surprise is they forgot to do this before your support request. generally this is table stakes for managed DB

this is fourth day in a row of planetscale ads^H^H^H blog posts being on hn front page. as i mentioned on yesterday's thread, innodb_rows_read is known to be buggy. regardless, by design it includes cached rows. terrible thing to base billing on. real cloud providers base it on i/o instead since this is more reasonable metric of "use"

planetscale's fork of mysql-server adds only a single commit, which exposes rows_read in an extra place. this from company that keeps talking about "building a database" https://github.com/planetscale/mysql-server


Installing the time zone tables on a single instance is certainly not hard: https://dev.mysql.com/doc/refman/8.0/en/time-zone-support.ht...

The trickier part is orchestrating the ongoing management of that across a large dynamic fleet. And in this case, it was about much more than simply loading the tables: it was about using them to support importing databases into PlanetScale: https://github.com/vitessio/vitess/pull/10102

I'll link to my other comment on the billing issue: https://news.ycombinator.com/item?id=31509240

We've had to do some other changes to our MySQL fork as well that will show up there, but we'd love to not have any patches! We'd love to keep the patch set minimal (just as Amazon certainly does with RDS and Aurora). And I would certainly argue that Vitess, which is what we build PlanetScale around, is a meaningful piece of technology that pairs with MySQL to make a great database: https://vitess.io. You're of course free to disagree — and I wish you all the best as you work to build something great in the future.

what other managed sql DBs charge based on rows read, regardless of whether they are on-disk or in-memory? honest question. i am familiar with a number of managed mysql and postgres products, and none of them bill this way that i have ever seen

and for the record, despite planetscale staffers repeatedly denigrating rds (your competitor) on hn, aurora’s patch set is not “minimal”

i do think vitess is cool for what it's worth. i just think your managed db product has bananas billing and also is horrendously over hyped, and your ceo’s responses to criticism are very reminiscent of theranos or wework’s responses to same

I doubt that anyone would claim their billing metrics are perfect. If you find some specific workload that's actually cheaper on another serverless database offering then we'd love to hear about it (we strive for transparent, generous pricing). If you don't think that CPU usage based pricing — which is typical for serverless offerings and e.g. is what Aurora serverless uses in Aurora Capacity Units (ACUs) — is charging you for reads of cached data then I've got some bad news for you. :-) You're almost certainly being charged for reading the "row" from the network, write-ahead-logging for it and other ACID/MVCC related overhead, writing it to block device, reading it from the block device, reading it from memory, writing it to memory, sorting and comparing [pieces of] it, and writing it back to the network — all of these things take CPU cycles. I find this argument to be entirely missing the point.

Pointing out that surely Amazon would like to keep their patch set to a minimum (there's a high cost in maintaining custom patches as you upgrade MySQL) is in no way implying that their patch set is small. Minimal means the minimum required for what you need, rather than being some point of pride.

I'm certainly not on here bashing any other offerings. Between the two of us, I only see one person trolling / bashing. :-) With that, I will leave you to your opinions which you are of course free to have. Best of luck.

aurora serverless pricing is not based on cpu cycles. this is just not how ACUs actually work or scale or are priced, at all man

anyway i gather the answer to my question is that no, there are no other examples of managed sql dbs that bill the way you do. my complaint is this is inherently not transparent because it violates user expectations. users try comparing to io based providers and fail to understand the pricing math comparison (on io pricing 1 read = many rows) or caching implications (on io pricing, cached rows don't count as io)
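
To make that comparison concrete, here is a rough sketch of how the two billing units diverge for the same read. Everything in it is an assumption for illustration: the function names, the rows-per-page figure and the cache hit percentage are made up, and it counts billable units only, not any provider's actual prices.

```python
def billable_rows(rows_read: int) -> int:
    """Row-based billing: every row examined counts, cached or not."""
    return rows_read


def billable_ios(rows_read: int, rows_per_page: int, cache_hit_pct: int) -> int:
    """I/O-based billing: only pages fetched from storage count.

    rows_per_page and cache_hit_pct (0-100) are hypothetical inputs.
    Integer math is used throughout to avoid float rounding surprises.
    """
    # One page read returns many rows (ceiling division).
    pages_touched = (rows_read + rows_per_page - 1) // rows_per_page
    # Pages served from the buffer pool do not count as I/O.
    return (pages_touched * (100 - cache_hit_pct) + 99) // 100


rows = 1_000_000
print(billable_rows(rows))          # 1000000 billable rows
print(billable_ios(rows, 100, 0))   # 10000 billable I/Os, cold cache
print(billable_ios(rows, 100, 95))  # 500 billable I/Os, mostly cached
```

Under these made-up inputs the same query is a million units on one model and a few hundred on the other, which is why the two kinds of price list can't be compared line by line.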

as for denigrating rds, look to your ceo's past hn comments. i would link to it, but last time i did that i got flagged, despite it being a recent thread that i was directly participating in

It's fairly difficult to find actual details on ACUs and how it all works; the best I found after spending significant time looking was things like: https://www.jeremydaly.com/aurora-serverless-the-good-the-ba...

According to AWS you're paying for chunks of CPU and memory on a per second basis: https://aws.amazon.com/rds/aurora/faqs/

It's hard to imagine that the CPU capacity is measured in anything other than CPU cycles (time slices of physical capacity) — in the same way it's hard to imagine that the memory capacity is measured in anything but bytes. But whatever, I don't care. It's cool, good for them. The point was... you don't think you're paying for reads of records that are cached? I give up, I fail to see how this can really be a good faith discussion.

I don't know how all other serverless database offerings do pricing. What difference does it make? They're all different. As a user, you want it to be based on your usage and to be fairly and reasonably priced while also being easily audited and predictable. Those are the key properties I would care about.

I honestly cannot see how you could be missing the point by this much and still be operating in good faith so I'll for real, for real stop. :-)

you just are not understanding my point, that does not mean i am acting in bad faith! jeez

i originally said pricing for other managed sql dbs, not specifically “serverless” ones. we both know that is just a marketing term anyway

with ACUs the point is you configure min and max, and your cluster scales up/down based on a cpu utilization threshold. so, sure reading from memory uses cpu cycles — but a large cached read is incredibly unlikely to bump you over a scaling threshold which affects your bill, unless you’re doing some huge heavy sort operations

another key point is aurora serverless v2 does not scale down to 0 acu. you are always paying a predictable small amount for your base cpu and ram. minor increases in cpu usage literally do not impact your bill at all, which is why i do not believe your argument makes sense regarding cached reads.
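
A toy model of the scaling behaviour described in this comment, to show why a small CPU bump would not change the bill. This is not AWS's actual algorithm: the function name, the 70% threshold and the 0.5 step are all hypothetical, and scale-down is left out for brevity.

```python
def next_capacity(current_acu: float, cpu_utilization: float,
                  min_acu: float, max_acu: float,
                  scale_up_threshold: float = 0.7,  # hypothetical threshold
                  step: float = 0.5) -> float:      # hypothetical step size
    """Return the billed capacity for the next interval (toy rule)."""
    if cpu_utilization > scale_up_threshold:
        current_acu += step
    # Capacity never drops below the configured minimum or exceeds the maximum.
    return max(min_acu, min(max_acu, current_acu))


# A cached read nudges CPU from 30% to 35%: capacity, and so the bill, is unchanged.
print(next_capacity(2.0, 0.30, min_acu=2.0, max_acu=8.0))  # 2.0
print(next_capacity(2.0, 0.35, min_acu=2.0, max_acu=8.0))  # 2.0
# Only crossing the threshold moves the billed capacity.
print(next_capacity(2.0, 0.90, min_acu=2.0, max_acu=8.0))  # 2.5
```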

edit to add: the reason this matters for monetary cost of ELT/ETL is it often involves very large reads. if your jobs only extract recent/changed data, this will very likely be in buffer pool, and cost way less with io pricing than with your row based pricing. clear?

You think about us way more than is healthy.

this is your response to valid criticism of your pricing model and functionality? as ceo? really?

We love feedback and criticism. You show up on all of our threads and take it way too far. I promise you, nothing you say to us will throw us off our vision and mission. You will only make yourself angrier and less happy by yelling at us on this website.

if you think purpose of my comments is to “throw us off our vision and mission” you are mistaken

"real cloud providers" most definitely charge based on rows read/written. Many startups / side projects choose the on-demand billing model because they don't want a fixed $x / mo when they don't need it. Some of them also have pre-provisioned options, and it seems likely that Planetscale will end up doing something similar.

https://aws.amazon.com/dynamodb/pricing/

https://firebase.google.com/docs/firestore/pricing

https://cloud.google.com/bigquery/pricing#on_demand_pricing

i was talking about managed sql databases

your first two examples are nosql. third example charges by data size processed, not by rows!

rows is weird metric since some tables have tiny rows, some have huge
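
A quick illustration of why row count and data size can disagree as usage metrics. The row widths below are made-up numbers and `bytes_scanned` is a hypothetical helper, not any provider's formula.

```python
def bytes_scanned(row_count: int, avg_row_bytes: int) -> int:
    """Approximate data volume for a full scan (hypothetical helper)."""
    return row_count * avg_row_bytes


# Two tables with the same row count but very different row widths.
narrow = bytes_scanned(1_000_000, 50)     # tiny rows
wide = bytes_scanned(1_000_000, 5_000)    # huge rows

# Row-based billing sees these scans as identical (1,000,000 rows each),
# while size-based billing sees a 100x difference.
print(wide // narrow)  # 100
```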

The point is that the market has shown there is a huge appetite for alternative database billing models other than a fixed cost per month. From my limited personal interactions, I'm aware of 10-20 developers who are using Planetscale in large part because of their billing model. They would have never considered a SQL database before because of the fixed cost.

Those nosql options (probably the most popular in the world) also have the issue that row sizes are different, and if you're super cost conscious, you can change your architecture to take advantage of it. For example with Planetscale, you could store a lot more in JSON columns instead of other tables to reduce costs if that was your primary objective.

Is your frustration that you'd like to use Planetscale or a managed Vitess, but you are worried about locking yourself into a pricing model that you don't think will work for you?