
1. Usually it's a multi-way tradeoff between how much data you want to use, how much compute you want to spend, how much time and training data you have available, and how accurate you want the embeddings to be.

2. Yes, but lossily. With some types of byte strings it doesn't matter if you accidentally change a couple of bits; others cannot tolerate that at all without being hopelessly corrupted. This technique is not a magic card to surpass the limits imposed by information theory; it's "just" a more sophisticated dictionary for your compression algorithm.
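
To make the "dictionary" part concrete, here is a rough sketch using Python's standard `zlib` preset-dictionary support. This is a stand-in, not the embedding technique itself: zlib is lossless and its dictionary is a plain byte string rather than a learned model. `SHARED_DICT`, `message` and the two helper functions are made up for illustration.

```python
import zlib

# Hypothetical shared dictionary: bytes that both sides agree on ahead of time.
SHARED_DICT = b"the quick brown fox jumps over the lazy dog"


def compress(data: bytes, zdict: bytes = b"") -> bytes:
    """Deflate `data`, optionally priming the compressor with a dictionary."""
    if zdict:
        c = zlib.compressobj(level=9, zdict=zdict)
    else:
        c = zlib.compressobj(level=9)
    return c.compress(data) + c.flush()


def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inflate `blob`; the same dictionary must be supplied if one was used."""
    if zdict:
        d = zlib.decompressobj(zdict=zdict)
    else:
        d = zlib.decompressobj()
    return d.decompress(blob) + d.flush()


# A message that shares most of its substrings with the dictionary.
message = b"the lazy dog jumps over the quick brown fox"

plain = compress(message)                 # no prior knowledge
primed = compress(message, SHARED_DICT)   # compressor already "knows" the phrases

print(len(message), len(plain), len(primed))
```

The primed output is smaller only because the dictionary already contains most of what the message says. The information has not gone away; it has moved into the dictionary that both sides have to carry around, which is the information-theory point above.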