Hacker News
by grim_io 183 days ago
This seems like a good way to measure LLM improvement.
It matches my personal impression from using progressively better models over time.