Looking for fast GPT-4 level LLM usable via API. (not via OpenAI)
2 points by eerop 817 days ago
Hey, I'm looking for a fast LLM to use via API that's as good as, or almost as good as, GPT-4, and at least better than GPT-3.5-turbo.

GPT-4 via OpenAI is too slow -- a 100-token output takes >3 seconds.

Where can I find one?

2 comments

Use streaming?
Not possible, unfortunately. The thing I'm building on top of doesn't make it possible. I need it all at once.
Claude Haiku.
Thanks. I just tried it, and it's definitely faster, but it still sometimes takes >3 seconds (my app requires the completion to be done in <3 seconds).

I've tried to optimize it by reducing token length and other methods, but I'm wondering if there are any better LLMs.
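
Since the requirement is a hard <3 second ceiling rather than a good average, it may help to compare candidates on tail latency. Below is a minimal sketch of a timing harness. It assumes a hypothetical `complete(prompt)` function that wraps whichever API is being tested and returns the full (non-streamed) completion; `fake_complete`, `measure_latencies`, and `summarize` are illustrative names, not part of any real SDK.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(complete: Callable[[str], str], prompt: str, runs: int = 20) -> List[float]:
    """Call `complete` `runs` times and return each call's wall-clock latency in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        complete(prompt)  # hypothetical wrapper around the API under test
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float], budget_s: float = 3.0) -> Dict[str, float]:
    """Summarize latencies against a budget (3 seconds by default, as in the thread)."""
    ordered = sorted(latencies)
    # Nearest-rank 95th percentile: the smallest value with at least 95% of samples at or below it.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_s": statistics.median(ordered),
        "p95_s": ordered[p95_index],
        "max_s": ordered[-1],
        "over_budget": sum(1 for x in ordered if x > budget_s),
    }


def fake_complete(prompt: str) -> str:
    """Stand-in for a real API call (assumption): sleeps briefly and returns a fixed string."""
    time.sleep(0.01)
    return "ok"


if __name__ == "__main__":
    print(summarize(measure_latencies(fake_complete, "hello", runs=5)))
```

Swapping `fake_complete` for a real client call, using the same prompt and output length for every candidate, gives comparable numbers. Any nonzero `over_budget` count means that model would miss the deadline at least some of the time.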