by 19h 1087 days ago
Unrelated to the actual site: we’ve been testing a tuned version of airoboros-64b on an H100 cluster as a drop-in replacement for Claude-100k and GPT4-32k. It performs rather well at text generation, but the 2k context definitely shows, and its reasoning capabilities, as with LLaMA itself, are suboptimal.

For instance, obtaining JSON-structured data from free text is close to impossible.
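To make "obtaining JSON" concrete, here is a rough sketch of the kind of check one could run on raw model output: scan for the first parseable JSON object and confirm it has the expected keys. The function name `extract_json` and the `required_keys` parameter are made up for illustration; this is not the harness we actually used, just a minimal way to measure how often a model produces usable structured output.

```python
import json


def extract_json(text, required_keys=()):
    """Return the first JSON object embedded in `text` that contains all
    `required_keys`, or None if no such object can be parsed.

    Hypothetical helper for illustration only.
    """
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting exactly at `pos`
            # and ignores whatever trailing text follows it.
            obj, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and all(k in obj for k in required_keys):
            return obj
        # Not valid (or missing keys): try the next opening brace.
        pos = text.find("{", pos + 1)
    return None


if __name__ == "__main__":
    reply = 'Sure! Here is the data: {"name": "Ada", "age": 36} Hope that helps.'
    print(extract_json(reply, required_keys=("name", "age")))
```

Counting how often this returns None over a batch of prompts gives a crude failure rate for structured extraction; a stricter setup would validate against a full schema.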

That said, for summarization of texts of up to 2k tokens the model performs extremely well.

This is mostly due to its unfiltered nature; models like falcon-40b-instruct or other LLaMA derivatives would show clear biases here.