by masklinn 2800 days ago
> Which you probably want to do anyway because pascal-strings are simply better.

They're not, though. While having an explicit length is great, p-strings means the length is the first item of the data buffer, which is just awful, and it's why Pascal was originally limited to 255-byte strings: the length prefix was a single byte.
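To make the 255-byte limit concrete, here is a minimal sketch of a classic p-string encoding, assuming a single length byte at the front of the buffer. The function names `to_pstring` and `pstring_bytes` are made up for illustration and don't come from any library.

```rust
/// Encode `s` as a Pascal-style string: one length byte, then the bytes.
/// Returns `None` when the length does not fit in the single prefix byte.
pub fn to_pstring(s: &str) -> Option<Vec<u8>> {
    if s.len() > u8::MAX as usize {
        return None; // more than 255 bytes cannot be represented
    }
    let mut buf = Vec::with_capacity(1 + s.len());
    buf.push(s.len() as u8); // the length lives inside the data buffer
    buf.extend_from_slice(s.as_bytes());
    Some(buf)
}

/// Read the payload back: byte 0 is the length, the data follows it.
/// Returns `None` if the buffer is empty or shorter than its prefix claims.
pub fn pstring_bytes(buf: &[u8]) -> Option<&[u8]> {
    let (&len, rest) = buf.split_first()?;
    rest.get(..len as usize)
}
```

Widening the prefix lifts the limit, but the length still sits in front of the data in the same buffer.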

Rust and C++ use record strings, where the string type is a "rich" stack-allocated structure of (*buffer, length[, capacity], …) rather than just a buffer/pointer.
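As a rough illustration of that layout, the sketch below spells out a hypothetical `RecordString` struct (not a real type) and compares its size with Rust's `String` and `&str` handles. The three-word size of `String` is what current compilers produce, not a documented guarantee, so treat the numbers as an observation.

```rust
use std::mem::size_of;

/// Hypothetical record-string handle mirroring (*buffer, length, capacity).
/// The character data lives elsewhere; only this handle sits on the stack.
#[allow(dead_code)]
#[repr(C)]
pub struct RecordString {
    buffer: *mut u8,
    length: usize,
    capacity: usize,
}

/// Handle sizes in machine words: (RecordString, String, &str).
pub fn handle_words() -> (usize, usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<RecordString>() / word,
        size_of::<String>() / word,
        size_of::<&str>() / word,
    )
}
```

On the usual targets this reports three words for the owning handles and two for the borrowed `&str`, which carries only a pointer and a length.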

That is a fair point; I misunderstood the term to refer to any type of string where the length is stored explicitly. I'll try to refer to them by their correct name ('record strings') from now on :-)

> p-strings means the length is the first item of the data buffer, which is just awful

You can represent it as a struct of (length, char[]) which isn't awful.

> You can represent it as a struct of (length, char[]) which isn't awful.

It kinda still is: if you're storing it on the stack, you're dealing with an unsized on-stack structure, which is painful, and if you're not, you're paying a deref to access the length, which you shouldn't need to. If by `char[]` you mean `char*`, then it's a record string, not a p-string.

I mean a variable-length array, all stored together.

Presumably you'd allocate it on the heap in general. But a record string also requires a heap allocation.

Most of the time, when you're touching the length you're probably touching the string data too, so that dereference isn't going to cost very much. And it comes with a tradeoff of more compact local data. So I stand by it being not awful! It may not be perfect, but it's a solid option.
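For what it's worth, the heap-allocated (length, char[]) layout can be sketched like this. `InlineString` is a hypothetical type written for this comment: one heap block holds a `usize` length followed by the bytes, and the local handle is a single thin pointer. It assumes the input is valid UTF-8 (guaranteed here by taking a `&str`) and skips everything a real implementation would need, such as growth or cloning.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};

/// Hypothetical length-prefixed string: the heap block is `[len: usize][bytes...]`.
pub struct InlineString {
    block: *mut u8, // thin pointer: the only thing stored locally
}

impl InlineString {
    fn layout(len: usize) -> Layout {
        Layout::from_size_align(size_of::<usize>() + len, align_of::<usize>())
            .expect("string too large")
    }

    pub fn new(s: &str) -> Self {
        let layout = Self::layout(s.len());
        // SAFETY: the layout is never zero-sized (it always holds the prefix),
        // the block is usize-aligned, and it has room for the prefix plus data.
        unsafe {
            let block = alloc(layout);
            if block.is_null() {
                handle_alloc_error(layout);
            }
            (block as *mut usize).write(s.len());
            std::ptr::copy_nonoverlapping(s.as_ptr(), block.add(size_of::<usize>()), s.len());
            InlineString { block }
        }
    }

    /// Reading the length costs one dereference into the heap block.
    pub fn len(&self) -> usize {
        // SAFETY: `new` wrote a valid usize at the start of the block.
        unsafe { (self.block as *const usize).read() }
    }

    pub fn as_str(&self) -> &str {
        // SAFETY: the bytes after the prefix were copied from a valid `&str`.
        unsafe {
            let data = self.block.add(size_of::<usize>());
            std::str::from_utf8_unchecked(std::slice::from_raw_parts(data, self.len()))
        }
    }
}

impl Drop for InlineString {
    fn drop(&mut self) {
        // SAFETY: the block was allocated in `new` with this exact layout.
        unsafe { dealloc(self.block, Self::layout(self.len())) }
    }
}
```

The handle is one word wide instead of the two or three a record string needs, and the price is that `len()` has to follow the pointer.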

When you have record strings you get slicing for free though. Without the indirection of a pointer you have to copy data when you slice (or you must have a separate 'sliced string' type).
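A small sketch of the difference, using Rust's `&str` as the record-string slice and a one-byte length prefix as the p-string. Both function names are invented for the example, and `pstring_first_word` assumes a well-formed, non-empty buffer.

```rust
/// Record-string slice: a new (pointer, length) pair into the same buffer.
/// No bytes are copied.
pub fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(end) => &s[..end],
        None => s,
    }
}

/// Length-prefixed equivalent: the prefix must sit directly before the data,
/// so the substring needs its own buffer with its own prefix.
pub fn pstring_first_word(buf: &[u8]) -> Vec<u8> {
    let len = buf[0] as usize;
    let data = &buf[1..1 + len];
    let end = data.iter().position(|&b| b == b' ').unwrap_or(len);
    let mut out = Vec::with_capacity(1 + end);
    out.push(end as u8); // end <= len <= 255, so this cannot truncate
    out.extend_from_slice(&data[..end]); // the copy a record string avoids
    out
}
```

The `&str` returned by `first_word` points into the original buffer, so that buffer has to stay alive for as long as the slice is in use.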
"free" if you ignore the cost of doing lifetime management. So beneficial in some use cases but not others.