Hacker News
by vanuatu 16 days ago
I don't think it's much of an issue

- RL envs + synthetic data + human-annotated data

- Usage data from Codex/Claude Code/Cursor

Most of the model abilities in coding come from post-training, not pretraining

1 comment

A better question is what's left for those who don't have access to that. We went from publicly available data to data vacuumed up from private users
Open source models

Unfortunately, all the incentives right now are for repos to be private

Open source models are for rich people: only they can afford the hardware needed to run them.