Hacker News
by swyx 7 days ago
thanks - credit to silas, eric, ben, and team for the depth of the evals, and the rest of the research team for doing the transcript reading parties lol

by nature of being based on open source, frontiercode public will saturate very very quickly. frontiercode main will be >80% in less than a year. hopefully diamond will last a bit longer. we can do annual refreshes, but that's not my strategy for staying relevant - what i'm more excited to get funding for is a private held-out version of frontiercode based on repros of real enterprise customer problems. in an ideal agent lab (https://latent.space/p/agent-labs) you meticulously build up this domain understanding and that is essentially why both model labs and serious customers come to you.

Interesting. So frontiercode-IBM-Diamond is a thing you’d hope to sell the creation and certification of? And if it’s published then you’d expect model providers to train to frontiercode-IBM-Pro or whatever and publish it so that it would be considered a good model to use inside IBM? (Obviously just a random corporate choice here).
no, single customer focus would be bad for a number of reasons.

but frontiercode-finance? that'd be cool…

Ahh interesting. There are companies making good money doing private equity dd work, largely custom harnesses. A lot of open space right now.