


	by bcjdjsndon 20 days ago
	Shame you stopped short of actually benchmarking that scale though, eh?


gaeld 20 days ago

Will do. We are a small team, and it takes time to implement and optimize a new model, whatever its size.


lostmsu 20 days ago

You don't even need to train the model just to check whether you can run inference on it at the claimed speed.
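A minimal sketch of that idea, assuming a toy dense network: the weights are random and untrained, but the forward pass does the same amount of arithmetic as a trained model of the same shape would, so it can still be timed. The names `benchmark`, `hidden_size`, `num_layers`, and `num_steps` are hypothetical and chosen only for illustration; this is plain Python, not the optimized C++/assembly path discussed in this thread, and models with data-dependent execution (for example sparse routing) may behave differently with random weights.

```python
import random
import time


def random_matrix(rows, cols, rng):
    # Untrained weights: values are arbitrary, only the shape matters for timing.
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    # Dense matrix-vector product.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward(layers, vector):
    # Apply each layer followed by a ReLU activation.
    for matrix in layers:
        vector = [max(0.0, v) for v in matvec(matrix, vector)]
    return vector


def benchmark(hidden_size=64, num_layers=4, num_steps=50, seed=0):
    # Returns (output length, forward passes per second) for a random-weight model.
    rng = random.Random(seed)
    layers = [random_matrix(hidden_size, hidden_size, rng) for _ in range(num_layers)]
    x = [rng.uniform(-1.0, 1.0) for _ in range(hidden_size)]
    forward(layers, x)  # warm-up pass, excluded from the timing
    start = time.perf_counter()
    for _ in range(num_steps):
        out = forward(layers, x)
    elapsed = time.perf_counter() - start
    return len(out), num_steps / elapsed


if __name__ == "__main__":
    size, passes_per_second = benchmark()
    print(f"output size {size}, {passes_per_second:.1f} forward passes/s")
```

Scaling `hidden_size` and `num_layers` to the target model's dimensions gives a throughput number that does not depend on having trained weights.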


gaeld 20 days ago

True, and for third-party models we'll just re-use their public open weights.

There is a time-consuming part, though, that is performed manually by our (human) team: implementing the model's logic in heavily optimized C++ and assembly code, co-designed for each specific hardware card.

This can take months.

We hope to accelerate the process with AI agents, but we're not there yet.

