by jstimpfle 1075 days ago
> Well alright maybe a string is more like a big int or big decimal than a long, but all of these are commonly used enough that they should be part of just about any language.

There is no way to fit what you describe in C. A string needs storage and lifetime management -- not only do you have to create new strings, you also have to delete strings that become unreferenced. There is no way to wrap a nice syntax around this in C to just make temporary strings that get automatically cleaned up. You would have to introduce a dependency on a global heap allocator, and introduce reference counting or similar machinery, and C is simply not about doing that.
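To make that concrete, here is a rough sketch of what joining strings at runtime looks like in plain C. The helper names concat and greeting_length are made up for illustration and are not from any standard library. The point is only that every intermediate result has an owner who must free it by hand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: joins a and b into a freshly malloc'ed buffer.
   The caller owns the result and must free() it.
   Returns NULL if the allocation fails. */
static char *concat(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *out = malloc(la + lb + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1); /* also copies b's terminating NUL */
    return out;
}

/* Builds "Hello, world" from three pieces and returns its length,
   or 0 if an allocation fails. */
static size_t greeting_length(void)
{
    char *tmp = concat("Hello", ", ");
    if (tmp == NULL)
        return 0;

    char *full = concat(tmp, "world");
    free(tmp); /* the temporary has to be released explicitly */
    if (full == NULL)
        return 0;

    size_t n = strlen(full);
    free(full);
    return n;
}
```

Something like a + b + c would have to hide both the allocation and the free of the temporary, which is the machinery C leaves to the programmer.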

And with a more structured approach, that missing syntax doesn't hurt that much. It can feel good to know what lives where and how the storage is managed. If you don't like it, go look someplace else. But don't critique C for concentrating on more basic and essential abstractions.

> Unless there's a separate type I've never seen they're still just a char* that's made at compile time.

Compile time is what I said, right? And it doesn't make a char pointer but a fixed-size char array.
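A small sketch of the array vs. pointer difference, assuming a C11 compiler for _Static_assert. The function names array_size and pointer_size are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof on a string literal gives the array size, including the NUL. */
_Static_assert(sizeof "Hello" == 6, "the literal has type char[6]");

/* An array initialized from a literal keeps the full size. */
static size_t array_size(void)
{
    char x[] = "Hello"; /* char[6]: five letters plus the NUL */
    return sizeof x;
}

/* A pointer to the same text only knows the size of the pointer itself. */
static size_t pointer_size(void)
{
    const char *p = "Hello";
    return sizeof p;
}
```

The length is only lost once the array decays to a pointer, for example when it is passed to a function.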

You can have what you want in C++ thanks to RAII, like std::string. Whether the result is worthwhile is another question.

> If done right, the VM/compiler should be smart enough to optimize these repetitive things according to best known practices

User input isn't performance sensitive at all. You have a human in front of the machine who sends maybe a dozen bytes per second at peak. Any language can handle that.

For visual output you're sitting on top of a browser rendering engine that's highly optimized in C/C++/Rust etc. Billions of dollars have been put into it. It's still certainly possible to use the API (the DOM and CSS) in the wrong way to make it dog slow.

The efficiency of making modifications to the data model underneath is predicated on the selection of data structures. If those are wrong, it will be slow no matter the selection of language or VM/compiler.

The text buffer is certainly one central data structure that has to be fast. https://code.visualstudio.com/blogs/2018/03/23/text-buffer-r.... One thing to try is finding the "string" in there. See also the "Why not C++" section.

1 comments

The fundamental type isn't what you're thinking of as "string", which is roughly C++ std::string.

What's fundamental is the reference type, because that's the type you're going to use much more often. This is correctly a slice type: it refers to some bytes. I said above that &[u8] is the least capability that's reasonable, and that's at last what std::string_view gives C++.

This is one of the many fundamental choices Rust made that's much cleverer than it looks. The str type (some bytes which are promised to be UTF-8 text) is a language feature, a slight improvement on [u8] that's core to the language. String, however, is just a library type, albeit a very heavily optimized one. A $1 microcontroller might well have some use for &'static str, the immutable slice reference, e.g. to talk about some text baked into its firmware. But it doesn't have a heap allocator, it's not about to waste precious RAM on dynamic allocation, and so it doesn't need String.

The talking point is a string type that can be used with some convenience. Let's say, joining several of them together at runtime with '+'. Or whatever, maybe just a function call. I was explaining why it doesn't work just like that.

The str type you mention is a nice feature, and in the future, when I have switched to Rust or whatever, I might use it to write my programs in 5 lines less.

While in C I have to use C's "slice" type: char x[] = "Hello". I know it's not quite as good, since if I were to pass this around I would have to make a pointer + length representation for it. If I needed that, it can be automated for string literals: typedef struct { const char *buffer; size_t size; } String; #define STRING(lit) ((String){ lit "", sizeof(lit) - 1 }). Or pass a caller-owned buffer: char buffer[256]; my_api(buffer, sizeof buffer);
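Spelled out as compilable C, the pointer + length idea could look roughly like this. It's a sketch: string_equal is a made-up helper, and the body of my_api is purely hypothetical, it just stands in for any function that fills a caller-provided buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pointer + length view of some text. It does not own the storage. */
typedef struct {
    const char *buffer;
    size_t size; /* number of chars, not counting the NUL */
} String;

/* lit "" only compiles when lit is a string literal, so sizeof is
   applied to an array and never to a pointer by accident. */
#define STRING(lit) ((String){ lit "", sizeof(lit) - 1 })

/* Hypothetical helper: compares two views by length and content. */
static int string_equal(String a, String b)
{
    return a.size == b.size && memcmp(a.buffer, b.buffer, a.size) == 0;
}

/* Hypothetical stand-in for my_api: writes "Hello" into the caller's
   buffer, truncating if needed, and returns the chars written. */
static size_t my_api(char *buffer, size_t size)
{
    static const char msg[] = "Hello";
    size_t n = sizeof msg - 1;
    if (size == 0)
        return 0;
    if (n > size - 1)
        n = size - 1; /* leave room for the NUL */
    memcpy(buffer, msg, n);
    buffer[n] = '\0';
    return n;
}
```

In both variants the caller decides where the bytes live, so no allocator is involved.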

For the few situations where string manipulation is required, it's just not a real problem.