In the event someone is encountering suffix trees for the first time and thinking of using one: the amount of RAM required for suffix trees is obscene.
I totally fell for the "obscene memory" trap myself. My first encounter with suffix trees outside of a textbook was for an ITA Software 'Instant Search' puzzle. The requirement was sub-0.1ms search on a large string database, I went straight for a generalized suffix tree. Then I realized they had asked for the solution to fit within a 1GB heap. :(
I've never grokkeed suffix trees, but isn't possible for them to be O(n) in space (n total length of all strings)? Is there just an unacceptable constant factor overhead? I can imagine the pointer overhead being painful.
I wrote up the full 'war story' of how I had to profile the heap and optimize the node representation (shaving bytes off the edge storage) just to get it to boot without an OOM error: https://www.abahgat.com/blog/the-programming-puzzle-that-got...
It’s the most tangible example I've run into of where theoretical O(n) space complexity meets the reality of object pointer overhead.