by simonw 601 days ago

It turns out someone has written a plugin for my LLM CLI tool already: https://github.com/irthomasthomas/llm-cerebras

You need an API key - I got one from https://cloud.cerebras.ai/ but I'm not sure if there's a waiting list at the moment - then you can do this:

    pipx install llm # or brew install llm or uv tool install llm
    llm install llm-cerebras
    llm keys set cerebras
    # paste key here

Then you can run lightning fast prompts like this:

    llm -m cerebras-llama3.1-70b 'an epic tail of a walrus pirate'

Here's a video of that running, it's very speedy: https://static.simonwillison.net/static/2024/cerebras-is-fas...
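If you'd rather script this than watch a video, here is a minimal sketch that runs the same prompt through the `llm` Python API and reports wall-clock time. It assumes the `llm-cerebras` plugin is installed and the key was stored with `llm keys set cerebras`. `timed_prompt` and its `get_model` parameter are hypothetical names made up for this example, not part of `llm`; only `llm.get_model()`, `model.prompt()` and `response.text()` come from the library.

```python
import time


def timed_prompt(prompt, model_id="cerebras-llama3.1-70b", get_model=None):
    """Run one prompt and return (response_text, elapsed_seconds).

    get_model is a hypothetical injection point so the helper can be
    exercised with a stand-in model; by default it uses llm.get_model.
    """
    if get_model is None:
        import llm  # imported lazily; requires `pip install llm` plus the plugin

        get_model = llm.get_model
    model = get_model(model_id)
    start = time.perf_counter()
    # .text() blocks until the full response has arrived, so the elapsed
    # time covers the whole generation, not just the first token.
    text = model.prompt(prompt).text()
    elapsed = time.perf_counter() - start
    return text, elapsed


# Example use (needs network access and a valid Cerebras key):
# text, seconds = timed_prompt("an epic tail of a walrus pirate")
# print(text)
# print(f"{seconds:.2f}s")
```

This measures total round-trip time including network latency, so treat it as a rough comparison between models rather than a tokens-per-second benchmark.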

2 comments

croes 601 days ago

It has a waiting list

londons_explore 601 days ago

The "AI overview" in Google Search seems to run at a similar speed, and the resulting text is of similar quality.

simonw 601 days ago

I wonder which of their models they use. Might even be Gemini 1.5 Flash 8B which is VERY quick.

I just tried that out with the same prompt and it's fast, but not as fast as Cerebras: https://static.simonwillison.net/static/2024/gemini-flash-8b...

londons_explore 601 days ago

I suspect it is its own model. Running it on 10B+ user queries per day, you're gonna want to optimize everything you can about it, so you'd want something tuned to the exact problem rather than a general-purpose model with careful prompting.
