Do you store full copies of all data? What I mean is, if someone breaks into Greplin, can they effectively read all of my email assuming I've synced with Gmail? Or do you just index the data and reference sources using URLs?
Even if they only keep an index without a full copy, it's not that hard to reconstruct something close to the original from the index alone: an inverted index typically stores not only which documents contain each term, but also the position of each occurrence within the document. It's not possible to restore the original exactly, though, because steps like stemming throw information away.
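
To make that concrete, here's a rough sketch of a positional inverted index and how a document could be pieced back together from it. This is not Greplin's actual implementation (I have no idea how they store things); `naive_stem`, `build_index`, `reconstruct` and the sample document are all made up for illustration, and the stemmer is a toy that just strips a few suffixes.

```python
from collections import defaultdict


def naive_stem(word):
    """Toy stemmer (hypothetical): lowercase and strip a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed term to {doc_id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.split()):
            index[naive_stem(word)][doc_id].append(pos)
    return index


def reconstruct(index, doc_id):
    """Rebuild an approximation of a document using only the index."""
    slots = {}
    for term, postings in index.items():
        # .get() avoids creating empty entries in the defaultdict
        for pos in postings.get(doc_id, []):
            slots[pos] = term
    return " ".join(slots[p] for p in sorted(slots))


# Made-up sample document, standing in for a synced email.
docs = {"mail-1": "Meeting moved to Friday because Bob missed the flights"}
index = build_index(docs)
recovered = reconstruct(index, "mail-1")
print(recovered)  # meet mov to friday because bob miss the flight
```

The recovered text is perfectly readable even though the casing and word endings are gone, which is the point: an attacker with only the index still learns essentially what the email said.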