|
|
|
|
|
by jhaywood
4596 days ago
|
|
But the semantics of CHAR are not what most people expect and almost never what you actually want. If you want an actual fixed length of non-blank data you need an additional check constraint to enforce the min-length. CHAR semantically represents fixed length text fields from old data file formats not this data always has n (non-blank) characters. If you do have different length then a VARCHAR is more appropriate. Also a lot of application frameworks that interact with the database only deal with VARCHAR so then as soon as you use a CHAR you have to start worrying about trimming your text data because one of the biggest database text type gotchas is accidentally trying to compare a VARCHAR and a CHAR improperly. While I can see good reasons to include length checks there is never a good reason to use a CHAR unless you're trying to interoperate with COBOL programs written in the 80's |
|
how do you know that? did you take a survey? I've been working with DBAs for years, in my world everyone knows that's how CHAR works. The padding behavior is nothing new, and is intuitive - the value must be N characters in length, so if you stored less, you get back a right-padded string. This is exactly what I'd expect.
> CHAR semantically represents fixed length text fields from old data file formats
and two or three letter character codes like country codes, state codes, etc. are what we use CHAR for, these are fixed length text fields. They are still in modern use today. Plus lots of us still have to write apps that actually read old files too - CHAR is appropriate for these as well, assuming you are storing fields that aren't right-padded in the source datafile (such as social security numbers, etc.).
Your app will of course work with a VARCHAR instead, but the point of CHAR is that it's self-documenting as to the type of data to be stored in the field - fixed length, as opposed to variable length.
> If you do have different length then a VARCHAR is more appropriate.
if you are storing variable length, then you should absolutely use VARCHAR. That's why it's called "VAR", it means, "variable".
> Also a lot of application frameworks that interact with the database only deal with VARCHAR so then as soon as you use a CHAR you have to start worrying about trimming your text data
If you are using CHAR correctly, you don't have to trim anything, because you are storing a string that is exactly the length of the CHAR type. I'm not familiar with how an application framework would only know how to deal with VARCHAR and not CHAR, database adapters return strings for both types. And if there were such a framework, I'd not be using it.
> one of the biggest database text type gotchas is accidentally trying to compare a VARCHAR and a CHAR improperly.
all of which stems from the same, singular mistake - don't store variable length data in a CHAR - plus if you are comparing VARCHAR to CHAR, that is also usually doing it wrong, as an adequately normalized database wouldn't be repurposing some kind of fixed length datatype out into a VARCHAR of some kind elsewhere.
The aforementioned CHECK constraint is a good way to enforce that if the developers/frameworks in question tend to be error-prone about this kind of thing (it's not an error I've had much issue with, since I know how CHAR behaves).
> While I can see good reasons to include length checks there is never a good reason to use a CHAR unless you're trying to interoperate with COBOL programs written in the 80's
as it turns out a vast portion of the world economy is supported by mainframes and old software that often spits out fixed length datafiles, there is even Python code in my current work project just written in the past six months which is tasked with parsing such files (they are actually quite simple to parse, since you just pull out each field based on an exact position). Not to mention that boring things like state codes, country codes, and the like are often fixed length fields.