by bt1a
793 days ago
This won't dethrone Llama 3, but it's equally impressive. They mention this model's relative weakness on the TriviaQA eval, since packing 'knowledge' into a small model is lossier than packing in problem-solving skills (which shine on MMLU). Regardless, it's still a very useful thing to have offline and on the fly. Those scores are nothing to scoff at. Given that these data pipelines are likely harder to imitate than new architectures like Transformers, I assume there has been and will be an intense focus on synthetic data generation and cleansing. Llama 3 used 15T tokens in its training corpus vs 4.8T in the "scaled-up" version of phi-3. If you made it to the end of this disjointed ramble, I'm sorry.