Hacker News
by obblekk, 695 days ago
Worth noting that this model has 50% more parameters than Llama 3. There are performance gains, but some of them may come from simply using more compute rather than from better performance per unit of compute.