|
|
|
|
|
by alexhutcheson
1922 days ago
|
|
> There's also benefits to middle-operations, such as text editing (although text files are so small that inefficient operations aren't a big deal anymore). Applications operating on text would normally use a rope (aka cord) or gap buffer. A linked list would have horrible performance because random access within the text is important for most applications. > (I'm building up GBs of data: so saving redundant chunks like this really wins) You’re likely spending a huge amount of memory to store pointers between elements within the unshared chunks (8 or 16 extra bytes per element adds up in a hurry) so it may be worth investigating some “chunking” so that you only have to use pointers to link between chunks, which can be stored continuously. Alternatively, consider whether the index representation used by suffix arrays will work for your use-case: https://en.wikipedia.org/wiki/Suffix_array |
|
> You’re likely spending a huge amount of memory to store pointers between elements within the unshared chunks (8 or 16 extra bytes per element adds up in a hurry) so it may be worth investigating some “chunking” so that you only have to use pointers to link between chunks, which can be stored continuously.
Yeah, they're chunked or unrolled in practice. (Unrolled Linked List was one term I've seen. Chunking is also another term).
> Alternatively, consider whether the index representation used by suffix arrays will work for your use-case: https://en.wikipedia.org/wiki/Suffix_array
Haven't heard of these before. But I'll look into it.
EDIT: Its... kind of a complicated situation I'm in. I'm basically searching a game tree, and building a "linked list" for how the children relate to their parent. I need "breadcrumbs" to relate any position back to the root.
Not necessarily because I need the root, but because the root contains information that's relevant to all of its children. So a linked list of (Grandchild -> child -> root) is very natural for this application. I originally had an array and just copied the data to an array (for "more locality") to all the children. But this uses way more space.
And the depth of this tree is ~12+, exponentially increasing width as usual. You really don't want to copy the root's data to all of the children unnecessarily: even on a GPU, its far better to just pass a pointer and connect the children-to-parent relationships, and traverse the linked list.
Hooking "child.parent = parent" is very easy. And traversing the while(node != root) node = node.parent; loop is also easy as cake. There's only ~12 of these linked-list operations in practice (because the depth of the search tree only goes to ~12 deep or so), across literally billions of possibilities. The billions of possibilities lead to the ability to process the tree in parallel (billions of possibilities to search with "only" 16384 hardware threads suddenly makes the GPU seem small!)
Despite the linked-list in the algorithm, its clear to me that the GPU is the ideal architecture for processing all of these nodes in parallel.