2. Spend (literally) orders of magnitude more compute running the LLM over the data than any conventional compression algorithm would ever need.