

by gliptic 577 days ago

This could be circumvented by _training_ the LLM on the fly on the file data observed so far. This is what Bellard's other NN compressor, nncp, does [1], and it is currently #1 on Mahoney's benchmark [2]. Unfortunately this is too slow, especially when running on the CPU, as the Hutter Prize stipulates IIRC.

[1] https://bellard.org/nncp/

[2] http://mattmahoney.net/dc/text.html


lifthrasiir 576 days ago

In fact, pretty much every adaptive compression algorithm does this. The eventual compression ratio is thus determined by the algorithm (nncp, cmix, ...; this also includes smaller tweaks like those typically made by the Hutter Prize winners) and its hyperparameters.
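
To make "adaptive" concrete, here is a toy sketch (not how nncp or cmix actually work, which use far stronger models): an order-1 byte model that starts with no knowledge and updates its counts after every symbol. The function name `adaptive_code_length`, the uniform prior of 1 per symbol, and the sample input are all assumptions made for illustration. It only computes the ideal code length an arithmetic coder would approach; a decoder could reproduce the same probabilities because it applies the same update after decoding each symbol.

```python
import math

def adaptive_code_length(data: bytes) -> float:
    """Ideal code length in bits under a toy adaptive order-1 byte model."""
    # counts[context][symbol]; each context starts from a uniform prior of 1
    # per symbol (an assumption of this sketch), so nothing is shipped upfront.
    counts = {}
    total_bits = 0.0
    context = 0  # hypothetical start-of-stream context
    for symbol in data:
        table = counts.setdefault(context, [1] * 256)
        p = table[symbol] / sum(table)
        total_bits += -math.log2(p)  # cost of coding this symbol
        # Update only AFTER coding, using data the decoder has also seen.
        table[symbol] += 1
        context = symbol
    return total_bits

sample = b"abracadabra " * 200
bits = adaptive_code_length(sample)
print(f"{bits / len(sample):.3f} bits per byte (8.000 would be no compression)")
```

The model itself is never transmitted here; only the algorithm and its hyperparameters (the context order and the prior, in this sketch) decide the final size.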


gliptic 576 days ago

Yes, the only exception is dictionaries used in preprocessing, but I think that's mostly a tradeoff to reduce the runtime.
