


	by bravura 539 days ago
	antirez, it's probably identical to the approach in this paper: Li et al. 2024, "Evaluating Large Language Models for Generalization and Robustness via Data Compression" (https://ar5iv.labs.arxiv.org/html//2402.00861). There's a pretty straight line from assigning probabilities to a sequence of tokens to arithmetic coding, which is a near-optimal compression algorithm for that distribution: a sequence the model gives probability p costs about -log2(p) bits.
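
	To make the link concrete, here is a rough Python sketch of the bookkeeping, not the paper's code. It assumes you already have the probability the model assigned to each token that actually occurred; the function names are made up for illustration. The ideal cost is the sum of -log2(p), and an idealized (infinite-precision) arithmetic coder needs at most about two bits more than that for the whole sequence.

```python
import math

def ideal_code_length_bits(token_probs):
    """Ideal cost in bits: sum of -log2(p) over the probability
    the model assigned to each token that actually occurred."""
    return sum(-math.log2(p) for p in token_probs)

def arithmetic_code_bound_bits(token_probs):
    """Upper bound on the output size of an idealized arithmetic coder:
    ceil(-log2 P(sequence)) + 1 bits, where P is the product of the
    per-token probabilities. Real coders with finite precision add a
    little more overhead; that is not modeled here."""
    return math.ceil(ideal_code_length_bits(token_probs)) + 1

# Hypothetical per-token probabilities for a three-token sequence.
probs = [0.5, 0.25, 0.125]
print(ideal_code_length_bits(probs))      # 1 + 2 + 3 bits
print(arithmetic_code_bound_bits(probs))  # at most one or two bits above that
```

	The better the model predicts the text, the higher those probabilities and the fewer bits the coder emits, which is why compressed size can double as an evaluation metric.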