by danidiaz 3778 days ago
> Strings: The strings types are mature, but unwieldy to work with in practice. It’s best to just make peace with the fact that in literally every module we’ll have boilerplate just to do simple manipulation and IO.

Conversion is part of the hassle; the other part is the lack of common functions (like, say, "splitPrefix") that work across all string-like types.

For this, I recommend the monoid-subclasses package, which, among other goodies, offers the TextualMonoid typeclass, with instances for many string-like types.

http://hackage.haskell.org/package/monoid-subclasses-0.4.2/d...
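
To make that concrete, here is a minimal sketch of one function that works on several string-like types through TextualMonoid. It assumes the monoid-subclasses and text packages are installed; `capitalize` is a made-up helper for illustration, not something the package provides, and the exact set of class methods has shifted a little between package versions.

```haskell
import Data.Char (toUpper)
import Data.Monoid.Textual (TextualMonoid)
import qualified Data.Monoid.Textual as Textual
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- Hypothetical helper: upper-case the first character of any textual monoid.
-- splitCharacterPrefix returns Nothing when the input is empty (or does not
-- start with a character), in which case the input is returned unchanged.
capitalize :: TextualMonoid t => t -> t
capitalize s =
  case Textual.splitCharacterPrefix s of
    Just (c, rest) -> Textual.singleton (toUpper c) `mappend` rest
    Nothing        -> s

main :: IO ()
main = do
  putStrLn (capitalize "hello")        -- String
  print (capitalize (T.pack "world"))  -- strict Text
  print (capitalize (TL.pack ""))      -- lazy Text, empty input
```

The same definition serves String, strict Text and lazy Text without any conversion at the call site, which is the boilerplate the quoted passage complains about.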


Is there any good article on how Haskell got into a situation where there are multiple different string types? I understand how it happened with C/C++ (e.g. on Windows), but Haskell is much more modern than that.

I feel there must be some sort of story or interesting thing to learn here. Or is it just the usual str vs widestr type problems?

I was also interested to see the comment about huge records and the memory pressure they can cause. That seems like it's really an issue with immutability. I was expecting the author to provide some sort of advice or workaround for it, but apparently not.

I don't know an article, but the overall situation is pretty straightforward to understand.

The original Haskell strings were linked lists of characters. This was simple and elegant and worked well with the functional programming approach of the time (the late 1980s, by the way, so maybe Haskell in its origins isn't quite as modern as you think). Nobody was much concerned about high-performance string operations in Haskell back then.

Inevitably, people later wanted more performant string types. But should they be lazy or strict? And do you want an abstract representation of Unicode text, or something more immediately suitable for arbitrary binary data? Enter four more string types: strict and lazy Text, and strict and lazy ByteString. And now here we are with five string types in common use.
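
For anyone who hasn't run into them, here is a rough sketch of the five types side by side, with the conversions between them. It assumes the text and bytestring packages (both ship with GHC) and a UTF-8 encoding for the byte forms; the binding names are just for illustration.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- 1. String: a plain linked list of Char (type String = [Char]).
greeting :: String
greeting = "h\233llo"  -- \233 is U+00E9, e with an acute accent

-- 2. Strict Text: packed Unicode text.
strictText :: T.Text
strictText = T.pack greeting

-- 3. Lazy Text: a lazy sequence of strict Text chunks.
lazyText :: TL.Text
lazyText = TL.fromStrict strictText

-- 4. Strict ByteString: packed bytes, here the UTF-8 encoding of the text.
strictBytes :: BS.ByteString
strictBytes = TE.encodeUtf8 strictText

-- 5. Lazy ByteString: a lazy sequence of strict ByteString chunks.
lazyBytes :: BL.ByteString
lazyBytes = TLE.encodeUtf8 lazyText

main :: IO ()
main =
  -- Five characters, but six bytes once encoded as UTF-8.
  print (length greeting, T.length strictText, BS.length strictBytes)  -- (5,5,6)
```

Every arrow between these types is an explicit function call, which is where the per-module conversion boilerplate comes from.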

Thanks.

By "should the types be lazy or strict", do you mean that if you do a string operation like uppercase/lowercase/substring/replace/etc., that operation is lazy or strict? Or do you mean decoding bytes to characters, or ... what aspect of the type itself can be lazy or strict?

Yes, the issue is whether the various string operations are lazy or strict. But whether it's possible to implement lazy operations does depend on the type itself. If the type is implemented as just an array of bytes in memory, it would be impossible to do anything lazily, because there's nowhere to store thunks (unevaluated values), only data.
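
A small sketch of the difference, using the bytestring package that ships with GHC (the binding names here are made up for illustration). A lazy ByteString is a lazily built list of strict chunks, so the unevaluated parts can sit between chunks; a strict ByteString is one flat buffer with no such place.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

-- Two strict chunks: each one is a flat, fully evaluated byte array.
chunks :: [BC.ByteString]
chunks = [BC.pack "hello, ", BC.pack "world"]

-- The lazy type is built from strict chunks; the list spine may be lazy.
lazyGreeting :: BL.ByteString
lazyGreeting = BL.fromChunks chunks

-- Conceptually infinite: only representable because chunks are produced
-- on demand. There is no strict equivalent of this value.
endless :: BL.ByteString
endless = BL.cycle (BLC.pack "ab")

-- Taking a finite prefix forces only as many chunks as are needed.
firstSix :: BL.ByteString
firstSix = BL.take 6 endless

main :: IO ()
main = do
  print (BL.toChunks lazyGreeting == chunks)  -- True
  BLC.putStrLn firstSix                       -- ababab
```

So the laziness lives in the chunk structure, not in the bytes themselves, which matches the point that a bare array of bytes has nowhere to keep a thunk.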

I don't know about such an article, but Haskell's basic string is a linked list of characters. Nobody likes that.