Hacker News new | ask | show | jobs
by ambrop7 4755 days ago
Intrusive data structures are indeed more powerful. I made my own instrusive AVL-tree which can be found here [1] and an example here [2]. There's also the extra feature that the concept of a "link" is abstracted, so you can for example build a compressed AVL-tree inside an array using array indices instead of pointers, which don't break when the array is reallocated. It's also built in a different way than this usually done (macros), that is, a header file is included which redefines some identifiers, and the actual code of the functions is not inside macros, hence, easier to read.

About strings, I firmly believe that C's zero-terminated strings should be avoided and only used when necessary to communicate with existing code. They're really not simple and fast. You can't take a null-terminated string and extract a null-terminated sub-string without modifying the original - which is very often needed during various kinds of parsing. Also, null-terminated strings are unable to represent the zero byte, so in general, for example, you can't use them to store the contents of an arbitrary file, and if you do use them for purposes where null bytes can appear, you have to be careful and handle errors where nulls would implicitly truncate your string.

Just use (pointer, length) strings instead. It'll save you a whole bunch of trouble you may not see coming.

[1] https://code.google.com/p/badvpn/source/browse/#svn%2Ftrunk%... (CAvl_)

[2] https://code.google.com/p/badvpn/source/browse/#svn%2Ftrunk%... (cavl_test_)

1 comments

I don't see anything inherently broken about null-terminated strings. It means that the smallest string that can be represented is a single byte, and the largest string that can be represented is unlimited. By mixing in an integer, you're expanding the smallest string and placing an arbitrary limit on the largest string. If the integer and string data are mixed in a struct, then strings pack much less tightly, not just because of the space taken up by the integer, but also because it needs to be aligned. Most of the things you'd want to do with a string are O(n) anyway, like print to the terminal. Null-terminated strings aren't perfect for every use, but no representation is, and C doesn't prevent you from using a different representation when it's more appropriate. If you're writing a parser, then that's a more demanding application than usual, and you ought not feel locked into using the default representation.