
by danidiaz 3778 days ago

> Strings: The strings types are mature, but unwieldy to work with in practice. It’s best to just make peace with the fact that in literally every module we’ll have boilerplate just to do simple manipulation and IO.

Conversion is part of the hassle, the other part is not having common functions (like, say, "splitPrefix") that will work across all string-like types.

For this, I recommend the monoid-subclasses package, which, among other goodies, offers the TextualMonoid typeclass. It has instances for many string-like types.

http://hackage.haskell.org/package/monoid-subclasses-0.4.2/d...
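To make the "one function across all string-like types" idea concrete, here is a minimal hand-rolled sketch. It is not the monoid-subclasses API: the class `StringLike` and the helper `stripOrKeep` are names invented for this illustration, and the instances simply delegate to the `stripPrefix` functions that `Data.List`, `text` and `bytestring` already export.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified Data.ByteString.Char8 as B
import qualified Data.List as L
import Data.Maybe (fromMaybe)
import qualified Data.Text as T

-- Hypothetical class for illustration only; not part of monoid-subclasses.
class StringLike s where
  -- Remove the given prefix, or return Nothing if it is absent.
  splitPrefix :: s -> s -> Maybe s

instance StringLike [Char] where
  splitPrefix = L.stripPrefix

instance StringLike T.Text where
  splitPrefix = T.stripPrefix

instance StringLike B.ByteString where
  splitPrefix = B.stripPrefix

-- Written once, usable with String, Text and ByteString alike.
stripOrKeep :: StringLike s => s -> s -> s
stripOrKeep prefix s = fromMaybe s (splitPrefix prefix s)
```

With this in scope, `stripOrKeep "http://" "http://example.org"` works on a plain String, and the same call works on values built with `T.pack` or `B.pack`. A package like monoid-subclasses saves you from writing and maintaining such a class yourself.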

1 comments

mike_hearn 3777 days ago

Is there any good article on how Haskell got into a situation where there are multiple different string types? I understand how it happened with C/C++ (e.g. on Windows), but Haskell is much more modern than that.

I feel there must be some sort of story or interesting thing to learn here. Or is it just the usual str vs widestr type problems?

Also was interested to see the comment about huge records and the memory pressure it can cause. Seems like that's really an issue with immutability. I was expecting the author to provide some sort of advice or workaround for it, but apparently not.


michael_007 3777 days ago

I don't know an article, but the overall situation is pretty straightforward to understand.

The original Haskell strings were linked lists of characters. This was simple and elegant and worked well with the functional programming approach of the time (1980s, by the way, so maybe Haskell in its origins isn't quite so modern as you think). Nobody was much concerned about high performance string operations in Haskell at the time.

Inevitably, people later wanted more performant string types. But should they be lazy or strict? And do you want an abstract representation of Unicode, or do you want something more immediately suitable for arbitrary binary data? Enter four more string types. And now here we are with 5 string types in common use.
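For anyone who hasn't run into them, here is a rough sketch of the five types side by side, assuming the four newer ones meant here are the strict and lazy variants of Text (from the text package) and ByteString (from the bytestring package). The value names are made up for the example; the conversion functions are the ordinary ones from those packages.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy as TL

-- 1. String: a linked list of Char.
asString :: String
asString = "na\239ve"  -- \239 is U+00EF, i with diaeresis

-- 2. Strict Text: Unicode text in one packed buffer.
strictText :: T.Text
strictText = T.pack asString

-- 3. Lazy Text: a lazy list of strict Text chunks.
lazyText :: TL.Text
lazyText = TL.fromStrict strictText

-- 4. Strict ByteString: raw bytes, here the UTF-8 encoding of the text.
strictBytes :: BS.ByteString
strictBytes = TE.encodeUtf8 strictText

-- 5. Lazy ByteString: a lazy list of strict ByteString chunks.
lazyBytes :: BL.ByteString
lazyBytes = BL.fromStrict strictBytes

-- Going back from bytes to String needs an explicit decoding step.
roundTrip :: String
roundTrip = T.unpack (TE.decodeUtf8 strictBytes)
```

The text is five characters long but its UTF-8 encoding is six bytes, which is the "abstract Unicode vs. arbitrary binary data" split in a nutshell.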


mike_hearn 3776 days ago

Thanks.

By "should the types be lazy or strict", do you mean that if you do a string operation like uppercase/lowercase/substring/replace/etc, that operation is lazy or strict? Or do you mean decoding bytes to characters, or ... what aspect of the type itself can be lazy or strict?


michael_007 3776 days ago

Yes, the issue is whether the various string operations are lazy or strict. But whether it's possible to implement lazy operations does depend on the type itself. If the type is implemented as just an array of bytes in memory, it would be impossible to do anything lazily, because there's nowhere to store thunks (unevaluated values), only data.
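A small sketch of what that looks like with the text package, assuming its usual design: strict Text is one flat buffer, while lazy Text is a lazy list of strict chunks, so the unevaluated tail of that list is where the thunks live. The names `endless`, `firstFive` and `chunkCount` are just for this example.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- An infinite lazy Text. There is no strict equivalent, because a strict
-- Text must be fully materialised in memory.
endless :: TL.Text
endless = TL.cycle (TL.pack "ab")

-- Only the chunks needed for the first five characters are ever forced.
firstFive :: T.Text
firstFive = TL.toStrict (TL.take 5 endless)

-- A lazy Text exposes its representation as a list of strict chunks.
chunkCount :: Int
chunkCount = length (TL.toChunks (TL.fromChunks [T.pack "ab", T.pack "cd"]))
```

Trying the same thing with strict Text would mean building the whole infinite value before `take` could run, which never finishes.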


mcguire 3777 days ago

I don't know about such an article, but Haskell's basic string is a linked list of characters. Nobody likes that.
