HN Mirror



	by instance 1130 days ago
	I tested it on a serious use case and the quality was subpar. For real use cases I had to either host the most powerful model you can get (e.g. LLaMA-65B or so) on a cloud machine, which again costs too much (you'll be paying something like 500-1000 USD per month), or just go straight for GPT-3.5 on OpenAI. The latter makes the most sense economically.

2 comments

inferense 1130 days ago

what real use case did you use it for?

instance 1130 days ago

For instance, I used it in conjunction with llama-index for knowledge management. I created an index for the whole Confluence/Jira of a mid-sized company and got good results with GPT, but for LLaMA of this size that use case was too much.

dzhiurgis 1129 days ago

I'd argue 1k per month is nothing for a mid-sized company, but I can understand where you are coming from.

sroussey 1129 days ago

Did you try instructor-xl? It ranks highest on huggingface.

throwaway1777 1130 days ago

Making demos to raise investment probably

raffraffraff 1130 days ago

What about turning the cloud vm off except when you're actually using it?

quickthrower2 1129 days ago

So modal.com is "turning-the-vm-off-when-unused-as-a-service" :-)

I ran research/open_llama_7b_preview_200bt on there, using their Python example, with an A10G GPU.

It cost 2-3c per run, taking ~20 seconds each time, on fairly small prompts. So about the same as GPT-4?

Now, this is a non-expert just playing; it can probably be optimized by trying different GPUs and optimizing the code somehow.

I don't think you are using these models to save money, but you might be using them for tunability, privacy, mobility [1], secrecy or fun/research.

[1] In other words, you want to build a robot that can work disconnected from the internet.
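
As a rough sanity check on the numbers in this thread, the arithmetic can be sketched in a few lines of Python. The only inputs are the figures quoted above (2-3c per run, ~20 seconds per run, 500-1000 USD per month for an always-on machine). The function names are made up for illustration, and the sketch assumes the per-run cost scales linearly with billed seconds, which may not match how any particular provider actually bills.

```python
# Figures quoted in the thread; the helpers below are illustrative assumptions.
COST_PER_RUN_USD = (0.02, 0.03)         # "2-3c per run"
SECONDS_PER_RUN = 20                    # "~20 seconds each time"
ALWAYS_ON_USD_PER_MONTH = (500, 1000)   # "500-1000 USD per month"


def implied_hourly_rate(cost_per_run: float, seconds_per_run: float) -> float:
    """USD per billed hour implied by one run, assuming linear per-second billing."""
    return cost_per_run * 3600 / seconds_per_run


def break_even_runs(monthly_cost: float, cost_per_run: float) -> float:
    """Runs per month at which pay-per-run spending equals a flat monthly bill."""
    return monthly_cost / cost_per_run


if __name__ == "__main__":
    cheap_run, pricey_run = COST_PER_RUN_USD
    cheap_vm, pricey_vm = ALWAYS_ON_USD_PER_MONTH

    print("Implied rate (USD/hour):",
          implied_hourly_rate(cheap_run, SECONDS_PER_RUN),
          "to",
          implied_hourly_rate(pricey_run, SECONDS_PER_RUN))

    # Lowest break-even: cheapest VM against the most expensive run, and vice versa.
    print("Break-even runs per month:",
          round(break_even_runs(cheap_vm, pricey_run)),
          "to",
          round(break_even_runs(pricey_vm, cheap_run)))
```

With the thread's numbers, the implied rate works out to roughly 3.60 to 5.40 USD per billed hour, and the break-even point falls somewhere between about 16,700 and 50,000 runs per month. Below that volume, paying per run would be cheaper than the always-on machine; above it, the flat monthly bill wins. Whether the 2-3c figure includes startup overhead is not stated, so treat this as a ballpark only.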

unglaublich 1129 days ago

A "serious use case" means it needs to be available around the clock.
