Y
Hacker News
new
|
ask
|
show
|
jobs
by
ACCount37
67 days ago
Constantly. Minor revisions can easily "wobble" on benchmarks that the training didn't explicitly push them for.
Whether it's genuine loss of capability or just measurement noise is typically unclear.