Hacker News
by zavertnik 936 days ago
> Building on top of any of these platforms provided by trillion dollar companies is a sucker's game.

Until local models reach the fidelity and speed that these megacorps offer, what choice does anyone actually have with respect to AI? I was under the impression that even if you get over the initial cost of the hardware needed for speed, your outputs would still be lower quality than what you get from GPT/Claude/Bard (maybe?). I could be 100% wrong though.


The gap is closing. I'm finding goliath-120b does better than ChatGPT 3.5.

Nothing comes close to GPT-4, though.

For me, the gap between 3.5 and 4 is massive. If I'm stuck between using 3.5 and doing the work myself, more often than not I'm choosing to do it myself. Not to imply 3.5 is unusable, it's just that my bar for minimum fidelity is closer to 4 than 3.5 for the tasks I'm comfortable offloading onto an AI.

What are you running goliath-120b on? Is it costly to run all day every day? How long does it take to complete an output? I've thought about building a multi-GPU node for local LLMs but I always decide against it on the premise that the tech is so new I figure in the next 3-4 years we'll see specialized hardware combined with efficiency improvements that would make my node obsolete.

I run it on 2x RTX 3090. I bought them used (probably ex-miners).
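For a sense of what fitting a 120b model on that setup implies, here is a back-of-the-envelope sketch. It assumes roughly 120e9 parameters (taken from the "120b" in the name, not a measured count) and 24 GB of VRAM per RTX 3090, and it counts weights only, ignoring the KV cache and activations, so treat the numbers as a lower bound rather than a real sizing guide.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 120e9   # assumption: read off the "120b" in the model name
VRAM_GB = 2 * 24   # two RTX 3090s at 24 GB each

for bits in (16, 8, 4, 3):
    need = weight_memory_gb(N_PARAMS, bits)
    verdict = "fits" if need <= VRAM_GB else "does not fit"
    print(f"{bits:>2}-bit: {need:6.1f} GB -> {verdict} in {VRAM_GB} GB")
```

Under these assumptions 4-bit weights come to about 60 GB and 3-bit to about 45 GB, so running entirely on the two cards would presumably mean quantizing to around 3 bits per weight or offloading part of the model to system RAM.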

> I always decide against it on the premise that the tech is so new I figure in the next 3-4 years we'll see specialized hardware combined with efficiency improvements that would make my node obsolete.

You're probably right; this happened back in the day with Bitcoin mining.

How does Goliath-120b improve on llama2-70b by just combining two of them?

https://huggingface.co/alpindale/goliath-120b?text=Hi.

> An auto-regressive causal LM created by combining 2x finetuned Llama-2 70B into one.

I... don't know. Even the creator of the model doesn't know why it worked out so well.

It really is better (at reasoning) than the 70b models when I use it. Though some people reported that it makes spelling mistakes.

P.S. This doesn't always work out well: people have tried swapping different layers randomly, and it makes the models incoherent.
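To make "combining" a bit more concrete, here is a toy sketch of the general idea as I understand it: stacking slices of layers taken alternately from two models with the same architecture. The `interleave_slices` helper and the slice plan are made up for illustration and are NOT the actual goliath-120b recipe; the "layers" are just string labels, not real weights.

```python
def interleave_slices(layers_a, layers_b, slices):
    """Build a merged layer stack from (source, start, end) slices.

    source is "a" or "b"; start/end index into that model's layer list
    (end exclusive). Slices may overlap, so the result can be deeper
    than either parent.
    """
    sources = {"a": layers_a, "b": layers_b}
    merged = []
    for source, start, end in slices:
        merged.extend(sources[source][start:end])
    return merged


# Toy stand-ins: two 8-layer "models" whose layers are just labels.
model_a = [f"A{i}" for i in range(8)]
model_b = [f"B{i}" for i in range(8)]

# Hypothetical slice plan, chosen only for illustration.
plan = [("a", 0, 3), ("b", 1, 5), ("a", 3, 6), ("b", 4, 8)]

merged = interleave_slices(model_a, model_b, plan)
print(len(merged), merged)
```

Here the merged stack has 14 layers, more than either 8-layer parent but fewer than the 16 you would get by concatenating both. That would be one way for two 70b models to end up at 120b rather than 140b, though it says nothing about why the result reasons better.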