Hacker News
by grosbisou 4566 days ago
Extremely interesting. But I cannot quite understand why RSTRING_EMBED_LEN_MAX is calculated that way.

VALUE seems to be an unsigned integer type, defined via "typedef uintptr_t VALUE;" and "typedef unsigned __int64 uintptr_t;".

But I don't get why it is calculated like that. Can anyone explain?

3 comments

The small string buffer should be the same size as the "heap" struct so as not to waste memory -- remember, they share the same memory because they're members of a union. The heap struct contains three members which, taking into account alignment restrictions, usually add up to three times the machine word size (which is basically what sizeof(uintptr_t) is). The "-1" is because C strings are null-terminated, so the maximum length is one less than the size of the buffer.

What I don't know is why they don't simply use sizeof(heap) as the buffer size.
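
To make the sizing argument concrete, here is a small model of the union. It is a sketch, not Ruby's actual header: `value_t`, `model_string`, `EMBED_LEN_MAX`, `heap_size` and `embed_size` are made-up names standing in for `VALUE`, the `as` union and `RSTRING_EMBED_LEN_MAX`, and the size check assumes a common 32- or 64-bit ABI.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uintptr_t */

/* Hypothetical stand-in for Ruby's VALUE: one machine word. */
typedef uintptr_t value_t;

/* Same shape as the macro under discussion:
 * three words, minus one byte reserved for the '\0' terminator. */
#define EMBED_LEN_MAX ((int)((sizeof(value_t) * 3) / sizeof(char) - 1))

/* Simplified model of the `as` union: either heap bookkeeping
 * or an inline character buffer, sharing the same storage. */
union model_string {
    struct {
        long len;
        char *ptr;
        union {
            long capa;
            value_t shared;
        } aux;
    } heap;
    char ary[EMBED_LEN_MAX + 1];
};

/* Assumption: holds on common ABIs where the three heap members,
 * after alignment padding, occupy exactly three words. */
static_assert(sizeof(union model_string) == 3 * sizeof(value_t),
              "embedded buffer and heap struct should overlap exactly");

/* Size in bytes of the heap variant. */
static size_t heap_size(void)
{
    return sizeof(((union model_string *)0)->heap);
}

/* Size in bytes of the embedded buffer, terminator included. */
static size_t embed_size(void)
{
    return sizeof(((union model_string *)0)->ary);
}
```

On a typical 64-bit machine both helpers return 24, so the embedded variant can hold strings of up to 23 bytes without making the union any bigger.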

Ah that was obvious. Thanks, very clear answer.
It's using the storage in an RString struct that isn't otherwise occupied by the RBasic info:

https://github.com/ruby/ruby/blob/8f77cfb308061ff49de0a47e82...

Note the `as` union. The `heap` version has three VALUE-sized entries, so RSTRING_EMBED_LEN_MAX is calculated accordingly, with the -1 to account for the null terminator.

Good question. In a really roundabout way it manages to be the same size as the alternative struct.

Edit: I missed that part of it was another union, so I removed what I said about it being off on 32-bit.

I still don't understand why they go so roundabout by dividing by one and casting to int...
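
For what it's worth, neither step changes the number: the C standard defines `sizeof(char)` as 1, and the cast only changes the type of the result from `size_t` (unsigned) to `int`. The sketch below shows one practical difference a signed result can make. That this is why Ruby does it is my assumption, not something stated in the source, and all the names here (`value_t`, `LEN_MAX_UNSIGNED`, `LEN_MAX_SIGNED`, `fits_unsigned`, `fits_signed`) are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uintptr_t */

/* Hypothetical stand-in for Ruby's VALUE. */
typedef uintptr_t value_t;

/* Without the cast the expression has the unsigned type size_t. */
#define LEN_MAX_UNSIGNED ((sizeof(value_t) * 3) / sizeof(char) - 1)
/* With the cast it is a plain signed int of the same value. */
#define LEN_MAX_SIGNED ((int)LEN_MAX_UNSIGNED)

/* Dividing by sizeof(char) is a no-op by definition. */
static_assert(sizeof(char) == 1, "guaranteed by the C standard");

/* Comparison done in unsigned arithmetic. The explicit cast spells out
 * what the implicit conversion would do on common platforms:
 * a negative length wraps around to a huge value. */
static int fits_unsigned(long len)
{
    return (size_t)len <= LEN_MAX_UNSIGNED;
}

/* Comparison done in signed arithmetic: negative lengths stay negative. */
static int fits_signed(long len)
{
    return len <= LEN_MAX_SIGNED;
}
```

With a negative length such as -1, `fits_signed` reports that it is below the limit while `fits_unsigned` does not, which is the kind of surprise a signed constant avoids.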

Actually, in C and C++, longs are 32 bits on most 32-bit platforms. If you need a 64-bit integer type, you need either "long long" or some implementation-specific equivalent.
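
If you want to check the point about `long` on your own platform, a minimal sketch like the following reports the widths. The helper names `long_bits` and `long_long_bits` are made up for illustration. The two static assertions only state the minimums the C standard guarantees.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

/* The standard only promises minimum widths for these types. */
static_assert(sizeof(long) * CHAR_BIT >= 32, "long is at least 32 bits");
static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long is at least 64 bits");

/* Width of long in bits on the platform this is compiled for. */
static int long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Width of long long in bits on the platform this is compiled for. */
static int long_long_bits(void)
{
    return (int)(sizeof(long long) * CHAR_BIT);
}
```

Calling `long_bits()` gives 32 on typical 32-bit targets and on 64-bit Windows, and 64 on 64-bit Linux and macOS. `long_long_bits()` gives 64 on all of them.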
I know that, I misread it as using two longs plus two pointers, fixed it now.