| HN Mirror



	by simonw 603 days ago
	I wonder which of their models they use. Might even be Gemini 1.5 Flash 8B which is VERY quick. I just tried that out with the same prompt and it's fast, but not as fast as Cerebras: https://static.simonwillison.net/static/2024/gemini-flash-8b...

1 comment

londons_explore 603 days ago

I suspect it's its own model. When you're running it on 10B+ user queries per day, you're gonna want to optimize everything you can about it, so you'd want something really optimized for the exact problem rather than a general-purpose model with careful prompting.
