Safely Dropping MySQL Tables

Y	Hacker News new \| ask \| show \| jobs

	Safely Dropping MySQL Tables (planetscale.com)
	64 points by mscccc 1425 days ago

9 comments

skilled 1425 days ago

Would love to get some detailed, in-depth, and technically rich answers from people who upvoted this nonsensical article. It's literally a 100-word advertisement for a paid product.

link

solardev 1425 days ago

I dunno. As a web dev, I find these "latest feature from vendor X" (typically Cloudflare or Vercel) posts quite helpful.

They let me know what the current big companies are doing to improve developer experience and/or devops. Their teams are often bigger than my entire company, so learning from them is a great way to see what we might be able to implement (whether by buying their service or doing a small scale version of it in-house).

Secondarily, they sometimes also show end-user experience improvements that I never thought would be possible at a certain scale/cost. Cloudflare hosting Doom and IRC on serverless is one such example, or Vercel doing multiplayer live editing (Figma style) out of the box with Next.js is another.

For this case of this particular post, the idea of having an external metadata analytics system for Mysql, stored in a separate store and dashboard, is something I've only vaguely seen in DigitalOcean. Seeing it as fully featured as in Planetscale makes me both wonder if we should consider using them instead of a some generic cloud DB, and also whether we can implement something similar on a standard LEMP container. It's one of the more interesting pieces I read on HN today, actually.

For those of us who don't live and breathe infrastructure, big(ger) tech company blog posts is a great way to learn, and I value them!

link

skilled 1424 days ago

I'm not sure why you felt like you need to perform mental gymnastics over my comment. I said what I said because it's a valid point.

Let me help you understand it a little bit more in-depth.

First of all, the title used for this submission does not imply an announcement of a new feature. This one[0], however, does. Now, let's look past that. Let's say that the submission warrants front-page treatment fair and square because it has valuable content (a write-up at the very least). Lo and behold - it does not.

This submission[1] announcing Vercel's Build Output API has both a decent submission title (I can instantly tell it's a feature announcement) and fairly decent explanation for the feature, also. In fact, they even provided links to various documentations to help developers get started right away.

I'm still eagerly awaiting for someone to tell me what they learned from this specific submission.

[0]: https://news.ycombinator.com/item?id=15364896

[1]: https://news.ycombinator.com/item?id=32192498

link

solardev 1424 days ago

I don't know what mental gymnastics you're talking about? I thought I was literally answering your question... I didn't know that rich metadata analytics for mysql existed in an easy to use GUI from this or any vendor. That's valuable knowledge I just learned from the post. My Cloudflare and Vercel examples were similar.

Now, if your issue was with the article title, I fully agree. I wish the titles here were clearer.

link

daenz 1425 days ago

I didn't upvote it, but I think the product does inspire discussion about why we don't have basic important features like mandatory "last accessed" metadata on external state sources. Think of the number of hours of pain due to someone being "sure" a database/api/resource wasn't used, and then removing it. I'm not really put off by the advert for that reason.

link

throwusawayus 1425 days ago

tracking access time (every read) is huge performance bottleneck. especially if reliably persisting this

same reason why filesystems are often mounted with noatime attribute !

link

daenz 1425 days ago

Yes, the sensible default for this feature would be opt-out, like noatime, in case you need the performance boost and understand the implications of the choice.

link

samlambert 1425 days ago

This feature has zero performance impact on your database. It is powered by insights which is served from a separate data store.

link

solardev 1425 days ago

I think they meant if you were to add this as a default metadata feature into mainline mysql

link

tobyjsullivan 1425 days ago

The most generous hypothesis would be that Planetscale is an increasing popular product among the HN crowd - specifically, those who have to maintain large MySQL deployments. It seems like a very good product that is seeing something like viral adoption across those companies.

The next ingredient would be if dropping MySQL tables is a relatively big pain (or known but unmanaged risk) that most of those people have experienced at some point.

Then this feature would be exciting news for that not-so-small cohort and that seems sufficient for HN upvotes?

I don't know if that's what happened today. I've heard about planetscale a few times but never used it. I can attest that dropping tables is always scary.

link

skilled 1425 days ago

I know of many great products that a lot of HN users use or have used in the past. Not once have I felt the need to submit their announcement for a new product feature because I feel like it would spark discussion or foster genuine inspiration.

Don’t get me wrong but clicking on this link I was expecting an actual detailed guide/explanation.

There is nothing anyone can say to justify this being on the front page considering both the submission title and the actual content of the said submission.

link

ksec 1424 days ago

By your standard every single Apple product announcement should not be on HN's front page.

link

daenz 1425 days ago

>There is nothing anyone can say to justify this being on the front page

There's a "hide" button on HN posts if downvoting them isn't enough for you.

link

skilled 1424 days ago

There's a "flag" button also.

link

throwusawayus 1425 days ago

article title is misleading! it is about planetscale only, not mysql in general. borderline click bait

also posted by a former github person. not sure if poster is author or works at planetscale but a lot of ex github engg are there

link

samlambert 1425 days ago

It wouldn’t be a PlanetScale post without our biggest fan throwusawayus. Let me know if you want a hat.

link

itake 1425 days ago

Isn't the safe way to do this is to rename the table and watch for errors before performing the drop? Queries fail on the old table name and can easily be recovered by reverting the name change.

link

grogers 1425 days ago

If you happen to still be using it then you'd get a lot of errors right away though. If you are extremely paranoid you can do better by migrating to a new user that doesn't have access to the table. Switching to the new user would be a normal rolling deploy, so you'd only start with 1/N errors, and your normal rollback process gets automatically triggered.

Probably not worth the hassle though.

link

jjeaff 1425 days ago

Oh, that's clever. I have so rarely utilized the user management system in MySQL beyond just creating a user with the necessary permissions for each application. I hadn't ever thought of using new users to do canary deployments.

link

mathnode 1425 days ago

You are spot on. And more importantly, a roll back procedure.

In this case, RENAME TABLE x TO y, and watch or wait for a process to fail in test/QA/UAT/etc.

Have a DROP ready, but also a RENAME back.

A production quality MySQL or MariaDB installation should have enabled binary logging and maybe delayed replication to handle any issues.

link

samlambert 1425 days ago

With https://planetscale.com/features/rewind you can just drop the table and bring it back if you have errors.

link

robertlagrant 1425 days ago

The safest way is to migrate to Postgres and then do what you like with your MySQL instance.

(I just wanted to win the prize for the most HNey comment ever!)

link

SnowHill9902 1425 days ago

You are missing Rust somewhere.

link

itake 1425 days ago

OP's solution is objectively worse, because it only reports access recency. Just because the table was or wasn't used recently, doesn't mean its safe to drop.

The code change may of landed the day before that disabled accessing that table, but the tool would warn that the table was still in use.

A quarterly cron job may access the table and go undetected if you look purely at the recency.

The rename strategy protects the data and offers easy rollback. This is a false heuristic.

link

donatj 1425 days ago

To me this is one of the major values of monorepo. A quick grep of the entire ecosystem for a table name and bam, I can drop it.

link

wereHamster 1425 days ago

won't work if you dynamically concatenate table names from strings! "select * from " + "use" + "rs;"

link

hinkley 1425 days ago

Little Bobby Tables would like to have a word with you...

I honestly wonder if you could make a functioning database API that either does not accept strings as arguments, or can detect string concatenation and reject it. Not just a builder pattern to greatly discourage it, but a straight up exception on bad input type. Bind variables or GTFO.

link

hxtk 1425 days ago

The book "Building Secure and Reliable Systems" from Google's series on SRE actually talks about two examples of this in C++ and Go which forbid using anything but string literals in the query string of an SQL API.

In Go, the solution was very tidy: it aliases string to an unexported internal type that consumers cannot instantiate. String literals can be coerced to that type, but variables that already have type information associated with them are rejected at compile time.

The C++ solution was a bit more complicated and involved templates.

link

thrashh 1425 days ago

You can probably do it trivially in any language with operator overloading or a very expressive type system

However you would also probably want a language with some syntactic sugar that let users use your special string type easily otherwise the burden on users will be too high

link

hxtk 1425 days ago

> However you would also probably want a language with some syntactic sugar that let users use your special string type easily otherwise the burden on users will be too high

In the instance under discussion in that section of the book I'm referencing, allowing the user to instantiate the special string type was precisely the behavior intended to avoid. The idea being that in the following three examples, the first succeeds and the rest fail at compile time:

    Query("SELECT * FROM Users WHERE id = ?;")
    
    id := "6"; Query("SELECT * FROM Users WHERE id = " + id + ";")

    id := 6; Query(fmt.Sprintf("SELECT * FROM Users WHERE id = %d;", id))

link

Arnavion 1425 days ago

Sure you can. Just make your transport protocol only support taking in a stored procedure name and parameters for DMLs, and some typed representation for DDLs.

But while that prevents people from concatenating strings to form DML queries as a whole, it obviously doesn't prevent the kind of concatenation wereHamster mentioned.

link

Hamuko 1425 days ago

It also won't work if you worship Satan and let end users enter the table name dynamically, but who the fuck does this?

link

dotancohen 1425 days ago

I would argue the opposite, actually. The storage model or database layer should be in a single repo. And thus there would be less code to grep through.

link

hinkley 1425 days ago

That doesn't work because that repo needs to expose an API that accesses that table, now you have to prove the API is never called, and you're back to searching the entire code base for that function call (hopefully you've used meaningful function names instead of DRY ones...)

link

willcipriano 1425 days ago

Add: log.info("get_user_by_user_id has been deprecated, please use the alternative API.")

Into the repo method, push up the change, redeploy the projects that depend on the library and wait a while and grep the logs to see if it ever gets used.

Monorepo is easier though.

link

hinkley 1425 days ago

Dude, that just tells you if the table is used on the happy path. It does not tell you if it's only used by a customer whose godson is the CEO's favorite kid. Or if customers only use it right before a federal deadline for filing forms, or at the end of the quarter when it's time to generate reports. Which you didn't collect for 2 months.

link

dotancohen 1424 days ago

  > Or if customers only use it right before a federal deadline for filing
  > forms, or at the end of the quarter when it's time to generate reports.

You're right. How would you suggest to defend against these situations?

I personally like to version my APIs, but that is robust insofar as code changes, not for database changes.

link

mnd999 1425 days ago

So grepping the query logs then.

link

Scarbutt 1425 days ago

That's too obvious.

link

siddontang 1425 days ago

Interesting feature, I think we can learn from this in our product.

IMO, it is still not safe even we know there are no queries running for this table. You may still meet a scenario that when you type `drop table`, another guy begins to run the query at the same time.

As the maintainer of another database, We have been trying our best to improve this scenario too.

Early on, we have provided a feature called `recover table` to recover table immediately after you wrongly drop the table. But this still can't avoid affecting current running queries on this table.

We call this problem `DDL affects DML`, and now we try to introduce a Table meta lock to guarantee that no any query is running when the DDL executed. We hope we can release this feature in the end of this year.

link

daenz 1425 days ago

Every source of state should have this feature. It is always inevitably needed.

link

rawoke083600 1425 days ago

How does this relate to pt-tools ? percona mysql toolkit ?

link

noasaservice 1425 days ago

This is about dropping objects in the MySQL database.

It is not about getting rid of MySQL as a database.

link

jacquesm 1425 days ago

This is where the value of documentation comes in, as well as some process around your data model. If you need this to keep you safe you are doing something terribly wrong somewhere.

link