https://huggingface.co/alpindale/goliath-120b?text=Hi.
> An auto-regressive causal LM created by combining 2x finetuned Llama-2 70B into one.
In my use it really is better at reasoning than the 70B models, though some people have reported that it makes spelling mistakes.
P.S. This kind of merge doesn't always work out well: people have tried swapping layers at random, and the resulting models come out incoherent.
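
For anyone wondering what "combining two models into one" can look like at the layer level, here is a rough sketch of one way to do it: stack overlapping, in-order windows of layers, alternating between the two donor models. This is only an illustration, not Goliath's actual recipe. The function names, the `model_a`/`model_b` labels, and the `window`/`stride` values are all made up for the example, and it only builds a plan (no weights are touched).

```python
def interleave_layers(num_layers, window, stride, sources=("model_a", "model_b")):
    """Build a stacking plan of (donor, start, end) slices, end exclusive.

    Windows slide forward by `stride` and alternate between donors, so each
    donor's layers stay in their original order (unlike a random swap).
    All parameters here are hypothetical, chosen for illustration.
    """
    if not 0 < stride <= window:
        raise ValueError("need 0 < stride <= window so no layers are skipped")
    plan = []
    start = 0
    i = 0
    while True:
        end = min(start + window, num_layers)
        plan.append((sources[i % len(sources)], start, end))
        if end == num_layers:
            break
        start += stride
        i += 1
    return plan


def total_layers(plan):
    """Depth of the merged stack implied by the plan."""
    return sum(end - start for _, start, end in plan)


# Toy example: two 8-layer donors, windows of 4 layers, sliding by 2.
plan = interleave_layers(num_layers=8, window=4, stride=2)
for donor, start, end in plan:
    print(f"{donor}: layers {start}..{end - 1}")
print("merged depth:", total_layers(plan))  # 12 layers from two 8-layer donors
```

The merged stack ends up deeper than either donor because the windows overlap, and layer order is preserved within each slice, which is the main difference from the random swapping that reportedly breaks things.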