Hacker News
by emadm 924 days ago
We included full training details for the base model, trained on 4 trillion tokens, including wandb etc.

https://stability.wandb.io/stability-llm/stable-lm/reports/S...