by thebeardisred 912 days ago
Wild. If I'm reading this correctly, it's effectively a sort of "zip" algorithm for both the inputs and outputs of a prompt-based model. It lets a user compress their request down to the minimal token count that retains the same semantics. In effect, this allows a user to pack denser content into the original request.

Does that sound about right?

3 comments

Yes, you're correct -- it's a really interesting thing, in that it reminds me of early 2023, when people would "compress" prompts by having ChatGPT rewrite them into something smaller.

There's really no substantive difference between that and what they're doing here, other than that they're purposefully using a crappier model than GPT-3.5/ChatGPT to increase the cost savings.

For example, the first set of graphics shows a long question with 5 Q/A examples ("5-shot", in the literature) being replaced by ~4 sentences that paraphrase the question and include one or two very brief examples without reasoning.

That's all well and fine if you're confident the model is so amazing that it answers as well with 1-shot as it does with 5-shot, but it is very, very, very likely that is not the case. Additionally, now you're adding this odd layer between the user's input and OpenAI that will easily be "felt".
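
To put a rough number on the 5-shot versus short-paraphrase trade described above, here is a minimal sketch. Everything in it is an assumption for illustration: the prompts are made up, `count_tokens` and `compression_ratio` are hypothetical helpers, and whitespace splitting is only a crude stand-in for a real tokenizer. It measures how much shorter the prompt gets and says nothing about whether answer quality holds up, which is the actual question.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    # A real model tokenizer would give different counts.
    return len(text.split())


def compression_ratio(original: str, compressed: str) -> float:
    """Return original tokens divided by compressed tokens (higher = more savings)."""
    compressed_tokens = count_tokens(compressed)
    if compressed_tokens == 0:
        raise ValueError("compressed prompt is empty")
    return count_tokens(original) / compressed_tokens


# Hypothetical 5-shot prompt: five worked Q/A examples, then the real question.
five_shot = "\n".join(
    f"Q: What is {n} + {n}? A: Let's add step by step. {n} + {n} = {2 * n}."
    for n in range(1, 6)
) + "\nQ: What is 7 + 7? A:"

# Hypothetical compressed prompt: a paraphrase plus one brief example, no reasoning.
one_shot = "Add the numbers. Example: 1 + 1 = 2. Q: 7 + 7? A:"

ratio = compression_ratio(five_shot, one_shot)
print(f"{count_tokens(five_shot)} -> {count_tokens(one_shot)} tokens ({ratio:.1f}x smaller)")
```

A real comparison would run both prompts against the same model and benchmark, and report accuracy next to the token savings.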

There is a need for a comparison; otherwise I find your assessment of the performance a "bit" subjective.
Please, by all means! I didn't mean to imply I have data or that you need to accept my comment as a scientific data-backed conclusion. :) I just have the lived experience of ~0 ML models performing better at 0-shot than 5-shot. That would be a good sign of AGI, in fact, now that I think about it... the model being able to work around good instructions with bad examples.
Sounds right to me. I think it’s fun that this may be the only compression algorithm where the output is still human-understandable.

It reads like a slightly garbled version of what someone writing down bullet point notes of a lecture might write.

It’s so rare that the human-optimized and machine-optimized versions of an input are so similar.

Is there a text file with many input/output pairs? I couldn't find it in the readme

The examples folder contains Jupyter notebooks, and there are also some videos and papers, but I just want to see an example of compressed text.

There are some examples on the website: https://llmlingua.com/
In fact, it can be seen as semantic communication, which is defined by Shannon.