Hacker News
by jeffbee 2217 days ago
string and bytes are the same on the wire but generate different application code. string is defined to be UTF-8, so if you are planning to put arbitrary bytes in there, you can't use that type. In some languages string is checked for UTF-8 conformance, which is expensive. In some languages allocating strings is more expensive than allocating bytes.

Only use string when you really mean UTF-8.
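
To make the "same on the wire" point concrete, here is a minimal sketch that hand-encodes a single length-delimited field (wire type 2) without using any protobuf library. The helper names encode_varint and encode_len_field, the field number 1, and the sample values are made up for illustration; generated code does more than this.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_len_field(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field: tag, payload length, payload."""
    tag = (field_number << 3) | 2  # wire type 2 covers both string and bytes
    return encode_varint(tag) + encode_varint(len(payload)) + payload


# Hypothetical field number 1, declared once as string and once as bytes.
as_string = encode_len_field(1, "key-42".encode("utf-8"))
as_bytes = encode_len_field(1, b"key-42")
same_on_wire = as_string == as_bytes

# Arbitrary bytes are fine in a bytes field but are not valid UTF-8,
# so they do not belong in a string field.
arbitrary = b"\xff\xfe\x00\x80"
try:
    arbitrary.decode("utf-8")
    arbitrary_is_utf8 = True
except UnicodeDecodeError:
    arbitrary_is_utf8 = False
```

The encoder never needs to know which of the two types the schema declared; the difference only shows up in the application-side type and in any UTF-8 validation.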

Isn't that still OK for keys? You typically want keys to be human-readable and human-writable strings anyway.
I recently had this debate in a professional context. My correspondent argued that string was good for filenames, because a filename is a human-readable string. I pointed out all the different ways that a filename (on a variety of platforms!) can be constructed such that it will not conform to UTF-8. In a new system I would want to see an iron-clad reason for string in this case, because it's not obviously correct and the efficiency story isn't on the side of string, either.

Luckily you can change from string to bytes at any time, since the wire format is the same, so this isn't the kind of mistake that gets cast in stone.
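
As a rough illustration of the filename problem: on POSIX systems a filename is a byte sequence, so a name like the made-up one below is legal on disk but is not UTF-8. This sketch uses only Python's built-in codecs; the name raw_name is hypothetical, and the surrogateescape handler is Python's own workaround, not something protobuf provides.

```python
# A hypothetical POSIX filename containing a byte that is invalid in UTF-8.
raw_name = b"report-\xff.txt"

# Strict UTF-8 decoding fails, so this name cannot go in a string field.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Python can smuggle the bad byte through as a lone surrogate code point...
text_name = raw_name.decode("utf-8", "surrogateescape")

# ...but the resulting str still cannot be encoded as strict UTF-8.
try:
    text_name.encode("utf-8")
    reencode_ok = True
except UnicodeEncodeError:
    reencode_ok = False

# Only the same escape handler recovers the original bytes.
round_trip = text_name.encode("utf-8", "surrogateescape") == raw_name
```

Keeping the field as bytes avoids the whole detour: the name is stored exactly as the filesystem reported it.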

Maybe your key is a hash of something. Converting it to a string is extra effort and it takes more space.
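
A quick sketch of the size difference, assuming SHA-256 as the hash and hex or base64 as the text encoding (both are just examples, and the input is made up):

```python
import base64
import hashlib

# Hypothetical key material.
digest = hashlib.sha256(b"some record contents")

raw_key = digest.digest()            # what you would store in a bytes field
hex_key = digest.hexdigest()         # one common text form for a string field
b64_key = base64.b64encode(raw_key)  # a more compact text form

raw_len = len(raw_key)
hex_len = len(hex_key)
b64_len = len(b64_key)
```

Here raw_len is 32, hex_len is 64, and b64_len is 44, so the text forms cost 2x and roughly 1.4x the space, plus an encode and decode step on every use.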