Hacker News new | ask | show | jobs
by adamtj 4018 days ago
> but may be (much) smaller.

The suffix array is already very small. It doesn't store the full suffixes, just the numbers. It's basically a list of pointers with one pointer for each character in the input data. The characters aren't duplicated, unlike a suffix tree. In fact, the suffix array is just the leaf nodes of a suffix tree in depth-first order. The internal nodes are omitted. Further, you can store it as an array instead of a linked list, meaning you halve the number of pointers to store.

1 comments

I mean that a suffix DAG would be smaller than a full suffix tree, sorry that that was confusing.

I am aware that a suffix array is smaller than a suffix DAG.