Hacker News
by gst 5598 days ago
Even if they only have an index and not a full copy, it's not that hard to reconstruct something close to the original from the index alone: an inverted index typically stores not only which documents contain a term, but also the term's word positions within each document. However, it's not possible to restore the original exactly, because steps like stemming discard information.
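A rough sketch of that reconstruction, in Python. Everything here is an assumption for illustration: the toy `stem` function, the `build_index` and `reconstruct` helpers, and the sample document are hypothetical and not taken from any particular search engine. It only shows that a positional index lets you put the stemmed terms back in order, and that the result differs from the original text.

```python
from collections import defaultdict


def stem(word):
    # Toy stemmer (assumption): lowercase and strip a few common suffixes.
    # Real stemmers such as Porter's are more involved, but equally lossy.
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    # Positional inverted index: term -> doc_id -> list of word positions.
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.split()):
            index[stem(token)][doc_id].append(pos)
    return index


def reconstruct(index, doc_id):
    # Walk every term's postings and place the term at each recorded position.
    slots = {}
    for term, postings in index.items():
        for pos in postings.get(doc_id, []):
            slots[pos] = term
    return " ".join(slots[p] for p in sorted(slots))


docs = {1: "Indexing stores the positions of words"}
index = build_index(docs)
restored = reconstruct(index, 1)
print(restored)  # index store the position of word
```

The word order survives, but case and suffixes are gone, which is the "similar but not exact" result described above.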
1 comment

Often in text indexing, the original document is kept for things like snippet generation.