I can even do that better in Lisp over C structs, in my own home-grown dialect. Suppose we have:
struct foo {
  int count;
  char name[16]; /* not null terminated! */
};
In the REPL, define an alias name foo for the type with typedef:
1> (typedef foo (struct foo (count int) (name (array 16 char))))
#<ffi-type (struct foo (count int) (name (array 16 char)))>
Now put an instance of the Lisp struct into a binary buffer using this FFI type:
2> (ffi-put #S(foo count 42 name "ABCDABCDABCDABCD") (ffi foo))
#b'2a00000041424344 4142434441424344 41424344'
Now, recover a new Lisp struct instance from this binary struct:
3> (ffi-get *2 (ffi foo))
#S(foo count 42 name "ABCDABCDABCDABCD")
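For readers without this dialect at hand, the same byte layout can be approximated with Python's standard struct module. This is only a sketch, not how the FFI above works: it assumes a little-endian int (which the 2a000000 in the buffer suggests) and no padding, and the names FOO, put_foo and get_foo are made up for illustration.

```python
import struct

# Assumed layout of struct foo: little-endian 4-byte int, then 16 raw bytes.
# "16s" is a fixed-size byte field, not a null-terminated string.
FOO = struct.Struct("<i16s")

def put_foo(count, name):
    """Pack count and a 16-character name into 20 bytes."""
    return FOO.pack(count, name.encode("utf-8"))

def get_foo(buf):
    """Unpack 20 bytes back into (count, name)."""
    count, raw = FOO.unpack(buf)
    return count, raw.decode("utf-8")

buf = put_foo(42, "ABCDABCDABCDABCD")
print(buf.hex())      # 2a000000 followed by 41424344 four times
print(get_foo(buf))
```

Unlike the FFI type above, this sketch does nothing special with bytes that fail to decode as UTF-8; get_foo would simply raise an exception on them.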
No problem; the FFI type system knows that an "array of char" is different from a null-terminated string, and can make it correspond to a Lisp string in both directions.
Now, for fun, let's poke a zero byte into that buffer:
4> (set [*2 8] 0)
0
5> *2
#b'2a00000041424344 0042434441424344 41424344'
There it is. Now decode:
6> (ffi-get *2 (ffi foo))
#S(foo count 42 name "ABCD\xDC00;BCDABCDABCD")
What's that? My UTF-8 decoder treats the 00 as an invalid byte, and maps it to U+DC00, a code point in the low surrogate range U+DCXX.
The otherwise optional semicolon after \xDC00 was output because the next character in the string, B, is a hex digit.
If that U+DC00 is encoded back, it will reproduce the null byte:
7> (ffi-put *6 (ffi foo))
#b'2a00000041424344 0042434441424344 41424344'
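The escaping idea can be mimicked in Python, as a rough sketch. Python's surrogateescape error handler maps undecodable bytes 0x80 to 0xFF to U+DC80 to U+DCFF, but it does not treat a zero byte as invalid, so the sketch handles NUL by hand to imitate the behavior described above. decode_escaped and encode_escaped are hypothetical names, not part of any library.

```python
def decode_escaped(raw):
    """Decode UTF-8, mapping bad bytes (and NUL, by assumption) to U+DCxx."""
    text = raw.decode("utf-8", errors="surrogateescape")
    # Python considers NUL valid; map it to U+DC00 to imitate the decoder above.
    return text.replace("\x00", "\udc00")

def encode_escaped(text):
    """Inverse of decode_escaped: U+DCxx code points become the original bytes."""
    text = text.replace("\udc00", "\x00")
    return text.encode("utf-8", errors="surrogateescape")

raw = b"ABCD\x00BCDABCDABCD"   # the 16 name bytes after the poke
text = decode_escaped(raw)
print(ascii(text))             # ascii() because lone surrogates cannot be printed directly
assert encode_escaped(text) == raw  # the round trip is lossless
```

The point of the design is that decoding never loses information: every byte that is not valid text gets a distinct placeholder code point, so encoding can put the exact byte back.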