| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by solaxun 1756 days ago
	I'm not sure how the size of a standard library has any bearing on the size and perceived complexity of the language. To me, it's about how many language features/keywords/constructs etc. one must keep in their head to effectively write code, including writing the standard library. I'd rather have a language with a very small core and an extensive set of libraries implemented with that core than one with a large core that tries to handle everything with features. There's something to be said about comprehensibility of libraries written in a language with a small and focused core, as well.

1 comments

lifthrasiir 1756 days ago

> To me, it's about how many language features/keywords/constructs etc. one must keep in their head to effectively write code, including writing the standard library.

This is only a partial measure. Imagine that you are working with a string. You must keep the basic properties of strings in your head: Unicode string, byte string, byte string with defined encodings, byte string that decodes as UTF-8 by default, null-terminated, C ABI compatibility, length-limited, can or cannot contain lone surrogate code points, mutable or immutable, ownership, thread safety, copy-on-write, tree-structured (e.g. ropes), locale dependent or independent, grapheme clusters and so on. These properties are not a part of language proper but still something that occupies your consciousness and definitely relates to the complexity. And even more so if you want to do something with strings (we call them idioms, which are very important parts of the language that people normally doesn't perceive so).

link

solaxun 1755 days ago

> You must keep the basic properties of strings in your head.

Actually I just call whatever the languages version of len, rest, first, strip, split etc. is and move on. Snark aside, when I'm using a language I don't keep the implementation details of it's data structures in my head, I just use the provided API. I think the representation of data structurs is a different discussion.

Maybe a more appropriate analogue would be how many features were used to implement a string library, rather than focusing on the details of how a string is represented in memory.

Do I need to be aware of 6 different potential ways to sequentially navigate the string? Is there a way to do it using a loop, iterator protocol, destructuring, pattern matching, coroutines, special string indexing syntax, etc? Or can I just use a simple, uniform consistent interface and build the library on top of that?

link

lifthrasiir 1755 days ago

> [...] when I'm using a language I don't keep the implementation details of it's data structures in my head, I just use the provided API. I think the representation of data structurs is a different discussion.

The exact details of data structures used do not matter, but their implications should still be in your head. Depending on the implementation you may need a separate type for string builder, or can append efficiently only to the end, or can append efficiently to both ends but not in the middle, or can append or insert an arbitrary string at any position but everything takes log(n) time by default.

> Do I need to be aware of 6 different potential ways to sequentially navigate the string? Is there a way to do it using a loop, iterator protocol, destructuring, pattern matching, coroutines, special string indexing syntax, etc? Or can I just use a simple, uniform consistent interface and build the library on top of that?

There is nothing like a "simple, uniform consistent" interface for strings. Strings are conceptually a free monoid^W^W an array of string units with the following tendencies:

- The "string units" can be anything from bytes to UCS-2/UTF-16 units to code points (or Unicode scalar values if you don't like surrogate pairs) to grapheme clusters (whatever they are) to words to lines. Even worse, a single string may have to be accessible in multiple such units.

- Many common desired operations can be efficiently described as a linear scan across string units. There is a reason that regular expression exists for strings but not for general arrays. (Regex-like tools for arrays would be still useful, but less so than strings.)

- A slicing operation is very common and resulting slices generally do not have to be mutated (even though the original string itself can be mutable), suggesting an effective optimization.

As such there are multiple trade-offs in string interfaces across languages and there is hardly the single best answer.

link