


	by fbrncci 1144 days ago
	Those startups will move on to open source models because OpenAI API calls with 32k-token contexts are way too expensive.

2 comments

cced 1144 days ago

What are the latest conversational models that give GPT-3-like (or close) performance when run locally?

noman-land 1144 days ago

Apparently Vicuna 13B is quite good according to Google's own leaked docs.

https://twitter.com/jelleprins/status/1654197282311491592

space_fountain 1144 days ago

That's according to this (https://lmsys.org/blog/2023-03-30-vicuna/) promotional blog post, and just cited by the Google memo, right? Which isn't really even a doc, just a memo that was circulating inside Google.

I also find it strange they don't contrast GPT-4 and GPT-3.5.

int_19h 1144 days ago

This assessment is based largely on GPT-4 evaluation of the output. In actual use, Vicuna-13B isn't even as good as GPT-3.5, although I do have high hopes for 30B if and when they decide to make that available (or someone else trains it, since the dataset is out).

And don't forget that all the LLaMA-based models only have 2K context size. It's good enough for random chat, but you quickly bump into it for any sort of complicated task solving or writing code. Increasing this to 4K - like GPT-3.5 has - would require significantly more RAM for the same model size.
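
As a rough illustration of the memory point above, here is a minimal sketch of how the key/value cache grows with context length. It assumes LLaMA-13B's published shape (40 layers, hidden size 5120) and fp16 values; the function name `kv_cache_bytes` is made up for this example, and the estimate covers only the cache for a single sequence, not the weights or other activations.

```python
def kv_cache_bytes(n_layers, d_model, context_tokens, bytes_per_value=2):
    """Estimate KV-cache size for one sequence.

    Each layer stores one key vector and one value vector of size
    d_model per token, hence the leading factor of 2.
    """
    return 2 * n_layers * d_model * context_tokens * bytes_per_value


# Assumed LLaMA-13B dimensions (not stated in the thread).
LLAMA_13B_LAYERS = 40
LLAMA_13B_D_MODEL = 5120

for ctx in (2048, 4096):
    size = kv_cache_bytes(LLAMA_13B_LAYERS, LLAMA_13B_D_MODEL, ctx)
    print(f"{ctx} tokens: {size / 2**30:.2f} GiB of fp16 KV cache")
```

Under these assumptions the cache scales linearly with context, so going from 2K to 4K doubles it, on top of the fixed cost of the model weights.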

amelius 1144 days ago

Is there a way to always stay up to date with the latest and best performing models? Perhaps it's me but I find it difficult to navigate HuggingFace and find models sorted by benchmark.

noman-land 1144 days ago

Honestly, I just read Hacker News :).

amelius 1144 days ago

HN posts are not always in chronological order.

noman-land 1144 days ago

I didn't say it was the best way, just the way I'm doing it right now :).

nickthegreek 1144 days ago

I check r/LocalLLaMA.

modernpink 1144 days ago

GPT-3 is dated, so many open source models are competitive with it, but Vicuna 13B is supposed to be competitive with GPT-4.

speedgoose 1144 days ago

Against GPT-3.5, perhaps the gaps aren’t too big for your use cases, but I wouldn’t say it’s in the GPT-4 league. It looks close in the benchmarks, but the difference in quality feels (to me) huge in practice. The other models are simply a lot worse.

modernpink 1144 days ago

Interesting. Have you tried StableVicuna?

speedgoose 1144 days ago

No, is it worth a try? I didn’t see a lot of hype about it so I didn’t try it.

raincole 1144 days ago

I don't think it's expensive at all. For things that don't need to be so correct (like, unfortunately, marketing blog posts) it's a <$1-per-post generator, which is very cheap to me.

For things where correctness matters, the majority of cost will still come from humans who are in charge of ensuring correctness.

fbrncci 1144 days ago

Even if it was around $0.10, this does not scale; it would need to be less than $0.01 per generation to keep up with open source models, where the cost is effectively $0 (leaving out hardware). These open source models are still not replacing GPT-4, but they are moving into that territory.
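
To make the scaling argument concrete, here is a back-of-the-envelope sketch comparing the two per-generation prices mentioned above. The user count and usage rate are invented for illustration, and the helper `monthly_cost_usd` is hypothetical; prices are passed in whole cents to avoid floating-point drift.

```python
def monthly_cost_usd(cents_per_generation, users, generations_per_user_per_day, days=30):
    """Total monthly API spend in dollars for a given per-generation price."""
    total_cents = cents_per_generation * users * generations_per_user_per_day * days
    return total_cents / 100


# Hypothetical workload: 10,000 users making 20 generations a day each.
USERS = 10_000
GENERATIONS_PER_DAY = 20

for cents in (10, 1):
    cost = monthly_cost_usd(cents, USERS, GENERATIONS_PER_DAY)
    print(f"${cents / 100:.2f} per generation: ${cost:,.0f} per month")
```

With these made-up numbers, the gap between 10 cents and 1 cent per generation is the gap between a six-figure and a five-figure monthly bill, before counting any hardware cost on the self-hosted side.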

raincole 1143 days ago

Oh really. Then show me your "open source model" that handles 32k tokens on a consumer-grade PC. Actually, don't show me, show the internet. You will be the most famous man in the tech world.

fbrncci 1143 days ago

Well, surely I can't convince you. Feel free to build the next AI startup on OpenAI then, and stop caring about any possible competition outscaling you once token limits on open source models become more in line with the walled garden of Google, MS and OpenAI's high API pricing ;)

raincole 1143 days ago

My bet is that open source models (true open source with no strings attached) won't ever catch up with OpenAI etc. I'll be really surprised if there is one that can match GPT-4 in the next 2-3 years. If you tried LLaMA and StableLM you would probably feel the same.

danjc 1144 days ago

Use cases for individual people are OK, but it's far too expensive to deploy into your SaaS where a large number of users will use it.
