by scottlamb 1681 days ago
Good for you. But I think you're being uncharitable by failing to distinguish between "concept I didn't understand" and "thing I forgot to consider until I saw the problem it caused". The title also suggests the former, but I think the author is being a bit humble by underplaying his existing knowledge. Likely he actually did know what indexes are before; if you asked him to detail how MySQL foreign keys work he might have even remembered to say they add an implicit index. But it's super easy to miss that you're depending on a side effect like that until you see the slow query (or, in this case, high bill).
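To make the "implicit index" point concrete, here is a minimal sketch. It uses SQLite (via Python's built-in sqlite3 module) as a stand-in rather than MySQL, and the table and index names are made up for illustration. SQLite does not create an index for a foreign key column on its own, so it shows what a lookup looks like when that side effect is absent, and how the plan changes once the index is added explicitly.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)


# Hypothetical schema: orders.user_id references users.id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id)
    );
""")

lookup = "SELECT * FROM orders WHERE user_id = 42"

# No index on user_id yet, so the planner has to scan the whole table.
plan_without_index = query_plan(conn, lookup)

# Add the index explicitly and ask for the plan again.
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_with_index = query_plan(conn, lookup)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, but the first plan is a table scan and the second one names the index.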

When you're programming, how many compiler errors do you see a day? (For me, easily dozens, likely hundreds.) Do you think each one indicates a serious gap in your knowledge?

Along these lines: imposter syndrome is a common problem in our industry. One way it can manifest is junior engineers thinking they're bad programmers when they repeatedly see walls of compiler errors. I think it'd help a lot to show them a ballpark of how often senior engineers see the same thing. [1] I know that when I'm actively writing new code (especially in languages that deliberately produce errors at compile time rather than runtime), I see dozens and dozens of errors during a normal day. I don't think this is a sign I'm a bad programmer. I think it just means I'm moving fast and trusting the compiler to point out the problems it can find rather than wasting time and headspace on finding them myself. I pay more attention to potential errors that I know won't get caught automatically and particularly to ones that can have serious consequences.

I think the most important thing the author learned is that failing to add an index can cost this much money before you notice.

Ideally the author and/or the vendor will also brainstorm ways to make these errors obvious before the high bill. Load testing with realistic data is one way (though people talk about load testing a lot more than they actually do it). Another would be watching for abrupt changes in the operations the billing is based on.
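As a rough sketch of the "watch for abrupt changes" idea: compare each day's billed operation count against the median of the preceding days and flag large jumps. The window size, the 5x factor, and the sample numbers below are all assumptions for illustration, not anything a particular vendor provides.

```python
from statistics import median


def find_spikes(daily_rows_read, window=7, factor=5.0):
    """Return indexes of days whose count exceeds factor x the median
    of the preceding `window` days."""
    spikes = []
    for i in range(window, len(daily_rows_read)):
        baseline = median(daily_rows_read[i - window:i])
        if baseline > 0 and daily_rows_read[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Made-up daily "rows read" counts: a steady week, then one bad deploy.
usage = [
    1_000_000, 1_100_000, 950_000, 1_050_000,
    990_000, 1_020_000, 1_010_000, 48_000_000, 1_030_000,
]
print(find_spikes(usage))  # -> [7]
```

Using the median rather than the mean keeps a single earlier spike from inflating the baseline and hiding the next one.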

[1] This is something I wish I'd done while at Google. They have the raw data for this with their cloud-based work trees (with FUSE) and cloud-based builds. I think the hardest part would be to classify when someone is actively developing new code, but it seems doable.


No, you've missed my point: the author seemingly didn't know that foreign keys add indexes by default in MySQL. It's not "concept I didn't understand"; clearly they're capable of understanding, because they did after they ran into the issue. It's about not having had basic knowledge to begin with.

But he didn't see compiler errors, he caused monetary cost to his employer.

When I deploy something that unintentionally causes a large monetary bill to my employer, then yes, I do believe that indicates a gap in knowledge, so I don't in any way believe I'm being uncharitable. Or, and this would be worse, a lack of caring. (Which is not what I think happened here, though.)

I won't respond to your imposter syndrome bit; I don't really think it's relevant to my point.

> When I deploy something that unintentionally causes a large monetary bill to my employer, then yes I do believe that indicates a gap in knowledge so I don't in anyway believe I'm being uncharitable.

It depends. If you've been given a loaded footgun, it's not entirely your fault when it inevitably goes off.

Let's go back to your "compiler errors" scenario, and let's say someone decided that the company should be using a cloud-based compiler that happens to charge per error. I wouldn't blame developers for falling into a trap that challenges all known assumptions.

The problem is that there is a DB that charges insane amounts of money per row processed with no upper limit and that someone actually thought it was a good idea to use it.

> The problem is that there is a DB that charges insane amounts of money per row processed with no upper limit and that someone actually thought it was a good idea to use it.

That's it in a nutshell. Usually you have an upper bound on compute, memory, disk space or some other resource for a specific price. When you hit those limits, you find performance issues and at that point you can choose to try optimizing your code or database, then decide whether you need to upgrade resources at cost.

I really don't understand this model that charges for rows read or, worse, "inspected". What's the upside of that model versus more typical pricing schemes, and how is it manageable/predictable from a budget perspective? With or without the indexing problem here, you'd really have to know your user behavior, then translate that to DB read counts by your app. And, while devs should all be optimizing code as much as reasonable, something as specific as minimizing DB reads seems an odd constraint to place on software.
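To show what that translation from user behavior to read counts might look like, here is a back-of-the-envelope sketch. The price per billion rows and the traffic figures are hypothetical placeholders, not any vendor's actual rates.

```python
def monthly_read_cost(queries_per_day, rows_read_per_query,
                      price_per_billion_rows, days=30):
    """Estimate a month's bill under a per-row-read pricing model."""
    rows_read = queries_per_day * rows_read_per_query * days
    return rows_read / 1_000_000_000 * price_per_billion_rows


# Hypothetical inputs: 100k queries per day at $1.50 per billion rows read.
PRICE = 1.50
full_scan = monthly_read_cost(100_000, 1_000_000, PRICE)  # no index: reads 1M rows per query
indexed = monthly_read_cost(100_000, 10, PRICE)           # index: reads ~10 rows per query

print(f"full table scan: ${full_scan:,.2f} per month")
print(f"indexed lookup:  ${indexed:,.2f} per month")
```

With these made-up numbers the same query load comes to about $4,500 a month without the index and a few cents with it, which is why the bill is so sensitive to a detail that fixed-capacity pricing would only surface as slowness.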

I'm guessing there must be some use case I'm missing; else I don't know why this pricing scheme is even a thing.

I am somewhat shocked to find that an RDBMS is considered a “loaded footgun” in 2021. Perhaps grandparent isn’t the most charitable in their interpretation, but I am in full agreement. It continues to astound me how little about the basics of databases most developers know, and how strongly resistant they are to trying to learn.

An RDBMS that scales infinitely while charging you per row goes against the usual assumptions learned in the past decades, so I'd say yes, that's a loaded footgun.

Have a friend who had a BigQuery query that ended up costing $3k each time. It ran for only a minute, because BigQuery chews through data really fast. But you don't realize that when you push the run query button. And there are no spending guardrails. They switched to paying for a given amount of parallelism after that.

First, it's not my "compiler errors" scenario; it's from the person who initially replied to me. Sure, whatever. I don't think I ever insinuated I thought that was a good idea; it runs in parallel with the issue I have.

I think you didn't read through to this part of my comment:

> I think the most important thing the author learned is that failing to add an index can cost this much money before you notice.

> Ideally the author and/or the vendor will also brainstorm ways to make these errors obvious before the high bill. Load testing with realistic data is one way (though people talk about load testing a lot more than they actually do it). Another would be watching for abrupt changes in the operations the billing is based on.

No, I did, but since I disagree with your earlier point about how much existing knowledge they have, it kind of by default means I disagree with what they took away from this incident.

It's also highly speculative so like I'm not going to go back and forth on it.

Needing a vendor to hand hold your likely highly paid dev seems like a bad fix to me.

Also, not having an index isn't an error; it can be a valid choice based on your situation and query load, which is why people should know the situations when indexes are needed.

I think people should simply be better. A lot of people don't like hearing that though so usually I keep it to my private chats where people seem more willing to cop to that fact.

I know we disagree, I know you're going to continue disagreeing, I know I don't want to have the conversation.

> I know we disagree, I know you're going to continue disagreeing, I know I don't want to have the conversation.

Please consider not chiming in on the next article like this then. I think your attitude of (paraphrasing) "no good programmer would have made the costly mistake you shared, and articles about it aren't worthwhile" is super harmful to our industry. It's the polar opposite of the blameless postmortem approach I'm fond of.

This one in particular is not worthwhile on the front page of HN, that's my take. They're most definitely useful for beginners, or maybe people just learning about databases.

I'm not going to not post simply because you find it disagreeable; there are plenty of people here who seem to agree with me.

Blameless postmortems are great, for your team. I am not his teammate, and I don't really feel a kinship with every developer under the sun. And for what it's worth, I don't blame this developer for anything. If anything, I lament the institutions that failed them on the way to this point in time. To me this is a symptom of systemic rot.

Your submissions are nothing but "ask HN". Leech. To chide about "not knowing" but then yourself ask the community seems a bit hypocritical.