by bravura
539 days ago
antirez, it's probably identical to the approach in this paper:
Li et al. 2024, "Evaluating Large Language Models for Generalization and Robustness via Data Compression" (https://ar5iv.labs.arxiv.org/html//2402.00861). There's a pretty straight line from assigning probabilities to a sequence of tokens to arithmetic coding, which compresses near-optimally for that distribution: the output length is roughly the sum of -log2 p(token) over the sequence, so a model that predicts the text better compresses it smaller.
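To make the probabilities-to-compression link concrete, here is a minimal sketch, not the paper's code. It assumes a fixed toy distribution `prob` over three symbols instead of an LLM's per-step conditional probabilities, and the helper names (`ideal_bits`, `arithmetic_interval`, `bits_to_identify`) are made up for illustration. It narrows an interval once per token, the way an arithmetic coder does, and compares the bits needed to name a point in the final interval against the Shannon code length.

```python
import math
from fractions import Fraction

def ideal_bits(tokens, prob):
    """Shannon code length: sum of -log2 p(token) over the sequence."""
    return sum(-math.log2(prob[t]) for t in tokens)

def arithmetic_interval(tokens, prob):
    """Narrow [low, low + width) once per token.

    The final width equals the product of the token probabilities.
    Assumption: a static distribution; an LLM would supply a fresh
    conditional distribution at every step.
    """
    symbols = sorted(prob)
    low, width = Fraction(0), Fraction(1)
    for t in tokens:
        # Cumulative probability of all symbols ordered before t.
        cum = sum((prob[s] for s in symbols[:symbols.index(t)]), Fraction(0))
        low += width * cum
        width *= prob[t]
    return low, width

def bits_to_identify(width):
    """Bits that suffice to name a binary fraction inside an interval of this width."""
    return math.ceil(-math.log2(width)) + 1

# Hypothetical toy model and message.
prob = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
tokens = list("aabac")

low, width = arithmetic_interval(tokens, prob)
print(ideal_bits(tokens, prob))   # 7.0
print(width)                      # 1/128
print(bits_to_identify(width))    # 8
```

The coder lands within a bit or two of the ideal 7 bits here. Exact fractions keep the sketch short; a practical coder would use fixed-precision integers with renormalization.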