by simonw 603 days ago
I wonder which of their models they use. It might even be Gemini 1.5 Flash 8B, which is VERY quick.

I just tried that out with the same prompt and it's fast, but not as fast as Cerebras: https://static.simonwillison.net/static/2024/gemini-flash-8b...

1 comment

I suspect it is its own model. Running it on 10B+ user queries per day, you're gonna want to optimize everything you can about it, so you'd want something really optimized for the exact problem rather than a general-purpose model with careful prompting.