by zeeshana07x 83 days ago
The gap between how this is described in the paper and how it is described in the blog post is pretty wide. It would be nice to see more accessible writing from research teams; not everyone reading is an ML engineer.
2 comments

These are very different media types with very different goals.
Agreed. The practical implications are often more interesting than the math anyway: smaller models that run locally mean you can afford to run multiple models in parallel and cross-validate their outputs, which changes how you approach tasks like code analysis or bug detection.