by rawrmaan
1138 days ago
There was a lot of detail and data in here, but it's not very useful to me because all of the comparisons are to things I have no experience with. There's really only one thing I care about: How does this compare to GPT-4? I have no use for models that aren't at that level. Even though this almost definitely isn't at that level, it's hard to know how close or far it is from the data presented.
|
The big story here for me is that the difference in training set is what makes the difference in quality. There is no secret sauce: the open source architectures do well, provided you give them a large and diverse enough training set. That would mean it is just a matter of pooling resources to train really capable open source models. That makes what RedPajama is doing, compiling the best open dataset, very important for the future of high quality open source LLMs.
If you want to play around with this yourself, you can install oobabooga and figure out which model fits your hardware from the r/LocalLLaMA wiki on Reddit. The llama.cpp 7B and 13B models can be run on CPU if you have enough RAM. I've had lots of fun talking to 7B and 13B Alpaca and Vicuna models running locally.
https://www.reddit.com/r/LocalLLaMA/wiki/models/
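For a rough sense of "enough RAM", here is a back-of-the-envelope sketch in Python. It only does the arithmetic of parameter count times bits per weight, plus a padding factor. The function names and the 20% overhead figure are my own assumptions for illustration, not numbers from llama.cpp or the wiki, so treat the wiki's tables as the real reference.

```python
def estimate_model_ram_gb(n_params_billion: float,
                          bits_per_weight: float,
                          overhead_fraction: float = 0.2) -> float:
    """Rough RAM estimate in GB (10^9 bytes) for loading a model's weights.

    overhead_fraction is an assumed padding for context buffers and runtime
    state; real usage depends on the quantization format and context length.
    """
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead_fraction) / 1e9


def fits_in_ram(n_params_billion: float,
                bits_per_weight: float,
                available_ram_gb: float) -> bool:
    """True if the rough estimate is within the RAM you can spare."""
    return estimate_model_ram_gb(n_params_billion, bits_per_weight) <= available_ram_gb


if __name__ == "__main__":
    # Compare 7B and 13B models at 4-bit quantization and at 16-bit weights.
    for size in (7, 13):
        for bits in (4, 16):
            print(f"{size}B @ {bits}-bit: ~{estimate_model_ram_gb(size, bits):.1f} GB")
```

By this estimate the 4-bit versions of 7B and 13B land in the single-digit GB range, while 16-bit weights need several times more, which is why the quantized builds are the ones people run on CPU.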