Hacker News
by aDyslecticCrow 165 days ago
Interesting to see a deep dive about string formats. I hadn't thought very deeply about it before.

I do agree with the immutable string argument. Mutable and immutable strings have different use cases and design tradeoffs. They perhaps shouldn't be the same type at all.

The transient string is particularly brilliant. I've worked with some low-level networking code in C, and being able to create a string containing the "payload" by pointing directly to an offset in the raw circular packet buffer is very clean. (The alternative is juggling offsets or doing excessive memcpy.)

So beyond the database use case it's a clever string format.
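
A rough sketch of that idea in Rust, assuming a made-up packet layout with a fixed-size header followed by a UTF-8 payload (`HEADER_LEN` and `payload_view` are hypothetical names, not from the article). The returned `&str` points into the receive buffer, so nothing is copied. It does not handle a payload that wraps around the end of a circular buffer.

```rust
/// Hypothetical header size; a real protocol would define its own layout.
const HEADER_LEN: usize = 4;

/// Returns a view of the payload that borrows from `packet` (no copy).
/// Yields `None` if the packet is shorter than the header or the
/// payload is not valid UTF-8.
fn payload_view(packet: &[u8]) -> Option<&str> {
    let payload = packet.get(HEADER_LEN..)?;
    std::str::from_utf8(payload).ok()
}
```

The view is only valid while the buffer slot it points into is not overwritten, which is the tradeoff that comes with skipping the copy.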

It would be nice to have an ISO or equivalent specification on it though.

2 comments

> The transient string is particularly brilliant. I've worked with some low-level networking code in C, and being able to create a string containing the "payload" by pointing directly to an offset in the raw circular packet buffer is very clean. (The alternative is juggling offsets or doing excessive memcpy.)

It's not anything special? That's just `string_view` (C++17). Java also used to do that as an optimisation, but because it was implicit and not easy to notice, it caused difficult-to-diagnose memory leaks (IIRC it was introduced in Java 1.4 and removed in 1.7).

> It's not anything special? That's just `string_view` (C++17)

Just because something already exists in some language doesn't make it less clever. It's not very widespread, and it's very powerful when applicable.

This format can handle "string views" with the same logic as "normal strings" without relying on interfaces or inheritance overhead.

It's clever.

> It's not very widespread

It is tho?

> and it's very powerful when applicable.

I don't believe I stated or even hinted otherwise?

> This format can handle "string views" with the same logic as "normal strings" without relying on interfaces or inheritance overhead.

"Owned" and "borrowed" strings have different lifecycles, and if you can't differentiate them easily it's very easy to misuse a borrowed string into a use-after-free (UAF), or, as Java did, into a memory leak. That is bad.

And because callees usually know whether they need a borrowed string, and borrowed strings are essentially free, the utility of making them implicit is close to nil.
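
To make the lifecycle difference concrete, here is a small Rust sketch (the function names are hypothetical). The borrowed result cannot outlive the buffer it points into, and the compiler enforces that; the owned result is a copy and can be kept around. In C both would typically be a plain pointer plus a length, and nothing in the type would stop you from keeping the first one too long.

```rust
/// Borrowed: the result points into `line` and is only valid while
/// `line` is alive. No allocation, no copy.
fn first_word(line: &str) -> &str {
    line.split_whitespace().next().unwrap_or("")
}

/// Owned: the result is a fresh heap copy with its own lifetime,
/// so it may outlive the buffer it was parsed from.
fn first_word_owned(line: &str) -> String {
    first_word(line).to_owned()
}
```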

Which is why people have generally stopped doing that, and kept borrowed strings as a separate type. Without relying on interfaces or inheritance.

> it's clever.

The wrong type thereof. It's clever in the same way Java 1.4's shared substrings were clever, with worse consequences.

> "owned" and "borrowed"

> java 1.4's

You're getting into pedantics about specific languages and their implementations. I never made a statement about C++ or Java. I work primarily in C99 myself.

> the utility of making them implicit is close to nil.

> Without relying on interfaces or inheritance.

Implement a function that takes three strings without 3! permutations of that function either explicitly or implicitly created.

> You're getting into pedantics about specific languages

No, I'm using terms which clearly express what I'm talking about, and referring to actual historical experience with these concerns.

> Implement a function that takes three strings without 3! permutations of that function either explicitly or implicitly created.

In the overwhelming majority of cases this is a nonsensical requirement: if the function can take 3 borrowed strings, you just implement a single function which takes 3 borrowed strings.
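
In Rust terms, as a sketch (`join_path` is a made-up example): one function with three `&str` parameters covers every combination, because an owned `String` can be lent out as a `&str` at the call site without copying.

```rust
/// Takes three borrowed strings. Callers holding a `String`, a string
/// literal, or a slice of a larger buffer all call this same function.
fn join_path(root: &str, dir: &str, file: &str) -> String {
    format!("{}/{}/{}", root, dir, file)
}
```

A caller could pass `&root` where `root` is a `String`, a sub-slice such as `&buffer[5..9]`, and a literal like `"app.txt"` in the same call.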

In the (rare) situation where optimising for maybe-owned makes sense, you use a wrapper type over "owned or borrowed". Which still needs no "interface or inheritance".
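
For illustration, Rust's standard library has such a wrapper in `std::borrow::Cow`. The `Record` type below is a hypothetical sketch: a single constructor accepts either a borrowed `&str` or an owned `String`, and a copy happens only if an owned value is requested from a borrowed one.

```rust
use std::borrow::Cow;

/// A record whose name is either a borrowed view or an owned string,
/// depending on what the caller passed in.
struct Record<'a> {
    name: Cow<'a, str>,
}

impl<'a> Record<'a> {
    /// Accepts `&'a str` (stored as a borrow) or `String` (stored as owned).
    fn new(name: impl Into<Cow<'a, str>>) -> Self {
        Record { name: name.into() }
    }

    /// True if the name still points into someone else's buffer.
    fn is_borrowed(&self) -> bool {
        matches!(self.name, Cow::Borrowed(_))
    }

    /// Converts into an owned `String`, copying only if it was borrowed.
    fn into_name(self) -> String {
        self.name.into_owned()
    }
}
```

The borrowed/owned distinction stays visible in the type (`Cow::Borrowed` vs `Cow::Owned`), so it is explicit rather than hidden inside a single string representation.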

I never really put much thought into it either, until I started playing with Rust, which supports pretty much every common way to use strings out there. Mostly for compatibility's sake, but still, it's wild all the same.