by jpcompartir
234 days ago
I can't remember which paper it's from, but isn't the variance in performance explained by the number of tokens generated? I.e., generating more tokens tends to give better performance. That isn't particularly amazing, since the number of tokens generated is basically a proxy for computation in this case: the more computation we spend, the better the answers tend to be.