by robocat 2255 days ago
AFAIK they just provide type name aliases, which neither stop you nor warn you if you mix the “types”.

They have changed that.

Now both the string types and the strings themselves carry an encoding. When you assign a string to a variable whose type has a different encoding, the string is converted automatically.

But it is causing a huge mess, especially with existing code. If one library uses utf-8 and another uses the default codepage, mixing them no longer works the way it used to. And since you can manually override the encoding of each individual string, any string might have any encoding regardless of its type.
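To make the difference between converting and overriding concrete, here is a toy model in Python. It is only an analogy, not Free Pascal code: `TaggedString`, `assign` and `relabel` are made-up names for illustration, and the real compiler behaviour may differ in details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedString:
    """Hypothetical stand-in for a string that carries its own encoding."""
    data: bytes
    codepage: str


def assign(value: TaggedString, target_codepage: str) -> TaggedString:
    """Model an assignment to a variable typed with target_codepage:
    the bytes are transcoded when the encodings differ."""
    if value.codepage == target_codepage:
        return value
    text = value.data.decode(value.codepage)
    return TaggedString(text.encode(target_codepage), target_codepage)


def relabel(value: TaggedString, codepage: str) -> TaggedString:
    """Model a manual override: the label changes, the bytes do not."""
    return TaggedString(value.data, codepage)


latin = TaggedString("é".encode("latin-1"), "latin-1")  # one byte: E9
utf8 = assign(latin, "utf-8")                           # converted: C3 A9
mislabeled = relabel(utf8, "latin-1")                   # still C3 A9, now read as latin-1
```

After the override, the bytes no longer match what the label claims, which is roughly the situation where a string's encoding and its declared type disagree.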

Here is an example of the mess:

I have a benchmark of various maps in Free Pascal. The benchmark creates strings of random bytes to use as keys.

A classic key-value store is the sorted TStringList.

Now the TStringList benchmark fails, apparently because it assumes the keys are valid utf-8 when utf-8 is the default codepage.

The default codepage can be changed. When I start the benchmark with LANG=C .. it works with the random byte keys. On Windows, the default codepage is usually a single-byte one close to latin1 (e.g. Windows-1252), so it would work there, too.
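Why random byte keys break under utf-8 but not under a single-byte codepage can be shown with a small Python sketch. Again, this is not the benchmark itself and says nothing about how TStringList works internally; the helper names and the key size of 16 bytes are my own assumptions.

```python
import random


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes form a well-formed utf-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def round_trips_latin1(data: bytes) -> bool:
    """latin-1 maps every byte value to a character, so this always holds."""
    return data.decode("latin-1").encode("latin-1") == data


rng = random.Random(0)  # fixed seed so the run is repeatable
keys = [bytes(rng.randrange(256) for _ in range(16)) for _ in range(1000)]

invalid = sum(not is_valid_utf8(k) for k in keys)
print(f"{invalid} of {len(keys)} random keys are not valid utf-8")
print("all keys round-trip through latin-1:",
      all(round_trips_latin1(k) for k in keys))
```

Any code that validates or transcodes the keys as utf-8 will trip over almost all of them, while a single-byte codepage accepts every byte sequence.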