Umbra: A Disk-Based System with In-Memory Performance
https://db.in.tum.de/~freitag/papers/p29-neumann-cidr20.pdf
Section 3.1 covers string handling.
This article (also linked from tfa) explains German strings in more detail.
https://cedardb.com/blog/german_strings
- 16-byte representation (two 64-bit words)
- fixed 32-bit length field
- short strings (up to 12 bytes) are stored in place
- long strings store a 4-byte prefix in place plus a pointer to the rest
- two bits of the pointer are used as flags to further optimize some use cases
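
For anyone who wants to poke at the layout, here is a rough Python sketch of the points above. It is not taken from Umbra or CedarDB: the names (`encode`, `decode`, `prefix`, `flags_of`, `equal`, `heap`) are made up for illustration, the "pointer" is just an offset into a bytearray, and the little-endian packing and the choice of the top two pointer bits for the flags are assumptions.

```python
import struct

INLINE_MAX = 12   # content bytes that fit next to the 4-byte length
PREFIX_LEN = 4    # bytes of a long string kept in place
FLAG_SHIFT = 62   # assumption: flags live in the top two bits of the pointer word
ADDR_MASK = (1 << FLAG_SHIFT) - 1

# Stand-in for out-of-line storage; a "pointer" is an offset into this buffer.
heap = bytearray()


def encode(data: bytes, flags: int = 0) -> bytes:
    """Pack data into a 16-byte value: length, then content or prefix + pointer."""
    if len(data) >= 1 << 32:
        raise ValueError("length must fit in 32 bits")
    if len(data) <= INLINE_MAX:
        # Short form: 4-byte length + up to 12 content bytes, zero-padded.
        return struct.pack("<I12s", len(data), data)
    # Long form: 4-byte length + 4-byte prefix + 8-byte pointer with flag bits.
    offset = len(heap)
    heap.extend(data)
    pointer = ((flags & 0b11) << FLAG_SHIFT) | offset
    return struct.pack("<I4sQ", len(data), data[:PREFIX_LEN], pointer)


def decode(value: bytes) -> bytes:
    """Recover the full string, following the pointer only for the long form."""
    (length,) = struct.unpack_from("<I", value)
    if length <= INLINE_MAX:
        return value[4:4 + length]
    (pointer,) = struct.unpack_from("<Q", value, 8)
    offset = pointer & ADDR_MASK
    return bytes(heap[offset:offset + length])


def prefix(value: bytes) -> bytes:
    """First four content bytes (zero-padded); same position in both forms."""
    return value[4:8]


def flags_of(value: bytes) -> int:
    """Read the two flag bits of a long string; short strings carry none here."""
    (length,) = struct.unpack_from("<I", value)
    if length <= INLINE_MAX:
        return 0
    (pointer,) = struct.unpack_from("<Q", value, 8)
    return pointer >> FLAG_SHIFT


def equal(a: bytes, b: bytes) -> bool:
    """Compare length + prefix first; touch the full data only if they match."""
    if a[:8] != b[:8]:
        return False
    return decode(a) == decode(b)
```

The point of `equal` is that length and prefix sit in the first eight bytes, so many mismatches are settled without following the pointer at all.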