Hacker News new | ask | show | jobs
by hmottestad 804 days ago
Samantha, llama 2 pubmed, marcoroni, openchat, fashiongpt, falcon 180B, deepseek llm chat, orca 2, orca 2 alpac uncersored, meditron, tigerbot, mixtral instruct, wizardcoder, gemma, nouse hermes 2 solar, yarn solar 64k, nouse hermes 2 yi, nous hermes 2 mixtral, nouse hermes llama 2, starcode2, hermes 2 pro mistral, norskgpt mistral and norskgpt llama.

Nouse Hermes 2 Solar is the best model for Norwegian that I've tried so far. It's much better than NorskGPT Mistral/Llama. I actually got it to make fairly decent summaries of news articles, though it wouldn't follow any stricter commands like producing 5 keywords in a json list. Kept producing more than 5 keywords and if I doubled down on the restriction on the number of keywords it would start messing up the json.

The best competitor to GPT-4 was falcon 180b, it's still terrible compared to GPT-4. Mixtral is my new favourite though, it's faster than falcon and in general as good or better. Though I would still pick GPT-4 over Mixtral any day of the week, it's leagues ahead of Mixtral.

Tigerbot has a very interesting trait. It tends to disagree when you try to convince it that it's wrong.

I haven't been able to test out the new 8x22 mixtral or command r plus. These are the next ones on my list!

1 comments

Just tested out Command R+ with some niche SHACL constraint questions and it performs considerably worse than GTP-4. Might be a bit better than GPT-3.5 though, which is actually pretty amazing.
You need to use their beginning and end token scheme and set rep pen to 1 to get good quality out of cr+.