
I haven't played with the model just yet, but just eyeballing its performance, it's significantly worse. I'm surprised they don't have Pythia on there, as that's what they're based on, from my understanding.

At their performance level, the most important comparison is against GPT-NeoX, and I do appreciate that they aren't making the "95% of GPT-4" claims that some fine-tuned LLaMA models are.

EDIT: For Databricks people: I'd love to see this compared with Pythia, LLaMA, Alpaca, and Vicuna/GPT4All if possible.