I should note that our linear layers are not the same as Microsoft's. In fact, we think Microsoft made a mistake in the code they uploaded. When I have time later today, I'll link to where I think the mistake is.
I've been following TriLLM. They've achieved great results, and I'm really impressed with the llama.cpp contributors already getting the models integrated.