by krasin 448 days ago
> Is this some joke? They use Llama 2 7B? What year is it?

They use llama2 to demonstrate that their compression method works. There are two possible cases:

1. The method works on all / most LLMs. In this case, it does not matter on which model they demonstrated the effect.

2. The method only works on llama2, but not on other models. Given that they published the code, I expect that people will quickly test the method on many other models, so we will know soon. Even then, there would be scientific significance if it works only on llama2, as it would mean that there's something special and good about that architecture.

But I would bet it's #1: the method works on most models, and they just picked whatever they already had code bindings for, to save effort.