Hacker News new | ask | show | jobs
by WorldMaker 592 days ago
It amazes me how many DBAs think the limit on a varchar column impacts the disk space. The "on disk" size for `varchar(12)` and `varchar(32)` and `varchar(MAX)` are roughly the same and depends on the data itself more than the schema. That's what the "var" in "varchar" means: variable storage size. The limits like (32) were added for compatibility with `char` and for type-based "common sense" validation. Sure, it helps prevent footguns like accidental DDoS of ingesting too much data too quickly, but there are other ways to do that basic top-level validation of "is this too much data to insert?".

Five varchar(12) columns is more storage overhead than one varchar(60). There's a lot of great use cases for varchar(MAX) and everyone I ever had tell me that varchar(MAX) wasn't allowed didn't understand the internals of DB storage that they thought they did and somehow still believe in their internal model of the DB that varchar is just spicy char and fixed column size allocation.

1 comments

With Postgres, we mostly just use `text` everywhere, unless there is an actual reason to have a size limit.

In other news, I haven't seen a dedicated "DBA" at a company in over a decade.

> With Postgres, we mostly just use `text` everywhere, unless there is an actual reason to have a size limit.

Yeah, there's still the very rare need to performance engineer out a fixed char field "to the left" of the table to speed up common table scans, but also so many of the reasons you might table scan strings have moved into proper full text search indexes or now all the rage is in vector embeddings.

> In other news, I haven't seen a dedicated "DBA" at a company in over a decade.

Yeah, anecdotally from LinkedIn and other sources it does seem like all the dedicated DBAs that have stayed that way have stuck to very specific niches and/or Oracle Products (including MySQL and derivatives these days; the "Oracle Effect" is strong). Especially in Amazon RDS and Azure SQL Server/Cosmos DB today, Postgres and Microsoft's SQL Server mostly run themselves and day-to-day administration is minor/trivial.