by anentropic
1138 days ago
The naming is confusing... these models aim to equal or beat LLaMA by reproducing the training data and methodology that were used for LLaMA. But the actual model architecture is slightly different, based on Pythia. I guess what is needed is a pythia.cpp: https://github.com/ggerganov/llama.cpp/issues/742