Hacker News
BTLM-3B-8K: 7B Performance in a 3B Parameter Model (cerebras.net)
3 points by jwan584 1064 days ago