| HN Mirror



	by antirez 532 days ago
	Yep. For lossy compression, an encoder-decoder model could work even better: you save just the embedding, and the decoder later turns that embedding back into the meaning.


srush 532 days ago

I've tried to build this sort of model several times, but could never get it to work. The challenge is that small perturbations in encoder space end up removing semantically important details (e.g. dates). You really want those perturbations to mess up syntax instead, to get something more analogous to a lossy video encoder.


nullc 532 days ago

I built a lossy text compressor in the days before LLMs.

I used a word embedding to convert the text to a space where similar tokens had similar semantic meaning, then I modified an ordinary LZ encoder to choose cheaper tokens if they were 'close enough' according to some tunable loss parameter.
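To illustrate the idea, here is a toy sketch, not the original implementation. It assumes a hypothetical two-dimensional embedding table (`EMBEDDINGS`) and treats "cheaper" as "already emitted earlier", on the assumption that a repeated token is easier for an LZ-style back-reference to cover. The names `lossy_substitute` and `loss` are made up for this example.

```python
import math

# Hypothetical toy embeddings; a real system would load trained word vectors.
EMBEDDINGS = {
    "big": (1.0, 0.1),
    "large": (0.95, 0.15),
    "huge": (0.9, 0.3),
    "dog": (0.1, 1.0),
    "hound": (0.2, 0.95),
    "cat": (-0.6, 0.8),
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return 1.0 - dot / norm


def lossy_substitute(tokens, loss):
    """Replace each new token with the nearest already-emitted token,
    provided the two are within `loss` cosine distance of each other.

    Tokens without an embedding (numbers, dates, names) pass through
    unchanged. Assumption: reusing an earlier token is 'cheaper' for a
    downstream LZ coder because it lengthens possible matches.
    """
    seen = []  # tokens emitted so far, in order of first appearance
    out = []
    for tok in tokens:
        choice = tok
        if tok in EMBEDDINGS and tok not in seen:
            best = loss
            for cand in seen:
                if cand in EMBEDDINGS:
                    d = cosine_distance(EMBEDDINGS[tok], EMBEDDINGS[cand])
                    if d <= best:
                        best, choice = d, cand
        if choice not in seen:
            seen.append(choice)
        out.append(choice)
    return out


if __name__ == "__main__":
    text = "big dog large hound cat 2021".split()
    print(lossy_substitute(text, loss=0.05))
```

Raising `loss` lets more distant tokens collapse into each other, which is where the amusing outputs start. The rewritten token stream would then be handed to an ordinary LZ coder.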

It "worked", but it was better at producing amusing outputs than at serving any other purpose. Perhaps you wouldn't have considered that working!

In terms of a modern implementation using an LLM, I would think that I could improve the retention of details like that by adapting the loss parameter based on the flatness of the model. E.g. for a date the model may be confident that the figures are numbers but pretty uniform among the numbers. Though I bet those details you want to preserve have a lot of the document's actual entropy.
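One possible reading of "adapting the loss parameter based on the flatness of the model", as a rough sketch: measure how flat the predicted distribution over the plausible next tokens is, and shrink the allowed loss as it approaches uniform. The function names and the linear scaling are assumptions made for this example, and `probs` stands in for whatever candidate distribution a real model would supply.

```python
import math


def entropy_bits(probs):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def adaptive_loss(probs, base_loss):
    """Scale the loss tolerance down where the model's prediction is flat.

    flatness is entropy normalized to [0, 1]: 0 means the model is sure of
    a single token, 1 means it is uniform over the candidates (as with the
    digits of a date). Assumption: a simple linear scaling; a real system
    would likely tune this curve.
    """
    n = len(probs)
    if n < 2:
        return base_loss
    flatness = entropy_bits(probs) / math.log2(n)
    return base_loss * (1.0 - flatness)


if __name__ == "__main__":
    digits = [0.1] * 10              # model knows "a digit", not which one
    function_word = [0.97, 0.02, 0.01]  # model is nearly certain
    print(adaptive_loss(digits, 0.2))
    print(adaptive_loss(function_word, 0.2))
```

With this scaling the digits of a date get a tolerance near zero, so they are kept exactly, while highly predictable tokens keep most of the base tolerance.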


antirez 532 days ago

Yep, makes sense... Something like 20 years ago I experimented with encoder/decoder models for lossy image compression and it worked very well, but it's a completely different domain indeed, one where there aren't single local concentrations of entropy that mess with the whole result.


mikaraento 531 days ago

I guess text in images would be similar, and is indeed where image generation models struggle to get the details right.

E.g., making a greeting card with somebody's name spelled correctly.
