Hacker News new | ask | show | jobs
by cbdfghh 3471 days ago
Just curious, is a Go string a C string inside (rune array + nil) or a proper class array?

In other words, in

a:="b"+"c"

IOW, does it implement the Slemiel's painting algorithm or not?

2 comments

I'm not sure I 100% understand your question, but in Go a string is a two-word struct.

    type string struct {
        data unsafe.Pointer
        len int
    }
The C version would be

    typedef struct string {
        const *char data;
        int len;
    } string;
Edit: oh, ok. I Googled the "painting algorithm" reference. I swear I had read that before :-)
If you consider googling it too: http://wiki.c2.com/?ShlemielThePainter
>Just curious, is a Go string a C string inside (rune array + nil) or a proper class array?

I can't imagine how you would even implement a string as anything other than a rune array.

Even a Pascal string is just the string length followed by characters.

> I can't imagine how you would even implement a string as anything other than a rune array.

You can also implement strings using a tree data structure. We do this in an implementation of Ruby that I work on because it can make concatenation faster.

A text editor edits a (potentially very long) string, but apart from, possibly, the likes of Notepad, none of them store that string as a single byte array.

So, look at text editor data structures for other representations. Examples:

- https://en.wikipedia.org/wiki/Gap_buffer

- https://en.wikipedia.org/wiki/Rope_(computer_science)

- https://en.wikipedia.org/wiki/Piece_table

In Haskell strings are lists of characters (https://hackage.haskell.org/package/base-4.9.0.0/docs/Data-S...). I hear not everyone is thrilled with the performance implications. :)

Also not a Haskell programmer, and really not meaning to criticize. Haskell seems awesome and I should learn it some day.

Just so people know, there are arrays used for text manipulation in Haskell in the Data.Text library, and bytestrings in the ByteString library. The language as specified does indeed have strings in a linked list of numbers, but if you even remotely care about performance you don't use those, and most libraries don't either nowadays.
Thanks! I guess I was kind of hoping for someone with actual knowledge to fill in the details.
as C does: with a \00 end marker.
Yeah, and how do you store NUL chars in those C strings then?

The answer is you can't. Because prefix-length strings (Pascal) are far better than null-terminated (C) strings.