| So the question is compression ratio. Let's assume Amazon's CIA/NSA op datacenter is receiving these messages at 1 PB a day. Let's assume they use a tool that strips certain very common words, plus an algorithm which can put the stripped words back later if they want to restore the messages. Text compresses really well, and if you have a replacement dictionary, even better. The most commonly used words: https://en.m.wikipedia.org/wiki/Most_common_words_in_English So you have a simple key-value store, and you compress that text even further. So what can we get a PB down to if we use that thinking? -- Hey, has anyone noticed that Palantir has stopped their employees from wearing their swag on BART and disappeared from reddit and HN? Oh, and all the NLP competitors went silent? Yeah, the deep state data processing market is booming. |
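On the compression question, here is a rough sketch of the "key-value store plus compressor" idea, so the ratio can be measured instead of guessed. Everything in it is an assumption for illustration: the ten-word `COMMON_WORDS` list (a small slice of the kind of list the Wikipedia page gives), the choice of control characters as tokens, the helper names, and zlib as the compressor. It is not anyone's actual pipeline.

```python
import re
import zlib

# Hypothetical mini-dictionary; a real one would hold many more entries.
COMMON_WORDS = ["the", "be", "to", "of", "and", "a", "in", "that", "have", "it"]

# Map each word to a single control character starting at U+000E.
# Assumption: these characters never occur in the incoming messages.
WORD_TO_TOKEN = {w: chr(0x0E + i) for i, w in enumerate(COMMON_WORDS)}
TOKEN_TO_WORD = {t: w for w, t in WORD_TO_TOKEN.items()}


def substitute(text: str) -> str:
    """Replace exact lowercase matches of common words with one-char tokens."""
    if any(tok in text for tok in TOKEN_TO_WORD):
        raise ValueError("input already contains a reserved token character")
    return re.sub(r"\w+", lambda m: WORD_TO_TOKEN.get(m.group(0), m.group(0)), text)


def restore(encoded: str) -> str:
    """Undo substitute() by expanding every token back to its word."""
    return encoded.translate(str.maketrans(TOKEN_TO_WORD))


def compressed_size(text: str) -> int:
    """Size in bytes after zlib at maximum compression level."""
    return len(zlib.compress(text.encode("utf-8"), 9))


SAMPLE = (
    "the analyst said that it would be hard to store all of the messages, "
    "and that the agency would have to keep a copy in the archive"
)

encoded = substitute(SAMPLE)
raw_bytes = len(SAMPLE.encode("utf-8"))
plain_zlib = compressed_size(SAMPLE)
dict_zlib = compressed_size(encoded)

print("raw bytes:            ", raw_bytes)
print("zlib only:            ", plain_zlib)
print("dictionary then zlib: ", dict_zlib)

# Naive linear projection to 1 PB (10**15 bytes) using this sample's ratio.
PETABYTE = 10**15
print("projected bytes per PB:", PETABYTE * dict_zlib // raw_bytes)
```

Worth running on a realistic corpus before trusting any number: a general-purpose compressor already learns repeated words on its own, so the extra gain from the dictionary pass may be small, and a sample this tiny says little about real traffic.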