Hacker News
by weinzierl 1035 days ago
This is just another fine-tune of LLaMA and Llama 2, of which there are already several. I doubt it will give seriously meaningful results for long-context inference.

A 32k context length sounds nice, of course, and it seems to be common to advertise merely fine-tuned models that way. I think it is more of a marketing thing, and we really should distinguish between the context length of the pre-trained model and that of the fine-tuned model, with the latter being the default meaning of context length.

1 comment

These 800 watt speakers are great. So loud.
“Better than lossless”