by Garcia98 1150 days ago
I disagree. They made the decision to use datasets with restrictive licensing, jumping on the Alpaca/GPT4All/ShareGPT bandwagon.

They also chose to toot their horn about how open-source their models are, even though for practical purposes half of their released models are no more open source than a leaked copy of LLaMA.

2 comments

So just use their base model and fine-tune with a non-restrictive dataset (e.g. Databricks' Dolly 2.0 instructions)? You can get a decent LoRA fine-tune done in a day or so on consumer GPU hardware, I would imagine.

The point here is that you can use their bases in place of LLaMA and not have to jump through the hoops, so the fine-tuned models are really just there for a bit of flash…
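
For a rough sense of why a LoRA fine-tune is plausible on consumer hardware, here is a back-of-the-envelope sketch of how many trainable parameters the adapters add. The hidden size, layer count, rank, and the choice of two adapted matrices per layer are all made-up illustrative values, not the configuration of any model discussed in this thread; the only thing taken as given is that a rank-r adapter on a d_in x d_out weight adds r * (d_in + d_out) parameters.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter.

    The adapter is two low-rank factors: A (rank x d_in) and B (d_out x rank).
    """
    return rank * (d_in + d_out)


def total_lora_params(hidden_size: int, num_layers: int, rank: int,
                      matrices_per_layer: int) -> int:
    """Total adapter parameters, assuming every adapted matrix is square
    (hidden_size x hidden_size). That is an assumption for illustration."""
    per_matrix = lora_param_count(hidden_size, hidden_size, rank)
    return per_matrix * matrices_per_layer * num_layers


if __name__ == "__main__":
    # Hypothetical values, chosen only to make the arithmetic concrete.
    hidden_size = 4096
    num_layers = 32
    rank = 8
    matrices_per_layer = 2  # e.g. adapting two attention projections

    total = total_lora_params(hidden_size, num_layers, rank, matrices_per_layer)
    print(f"trainable adapter parameters: {total:,}")
```

Under those assumptions only a few million parameters need gradients and optimizer state, while the base weights stay frozen. How long the run actually takes still depends on the dataset size and the GPU.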

Looks like you’re seeing the glass as half empty here. Not sure arguing here was more time-efficient than just running the eval on the other set of weights.

*I wish I understood these things well enough to not have to ask, but alas I’m just a basic engineer

I use a GPU server and runtime is not free unfortunately.

Ah no worries then. Thanks for your datapoint regardless.