Hacker News
by simonw 210 days ago
This would be extremely useful. I think this is one of the most commercially valuable uses of these kinds of models, so having more solid independent benchmarks would be great.