by wejick 1174 days ago
Seems like LLaMA-derived models are flourishing. However, since LLaMA is licensed as an academic-only, noncommercial model, what is the path for bringing these models to production for for-profit purposes?

I'm certainly interested in doing so.

4 comments

The Alpaca methodology has proven powerful, and it's being applied to models with better licensing. It's hard to track lineage, but I think the OpenAssistant models are the most permissive at the moment: they use an openly sourced dataset to build an instruct model on top of Pythia, which is itself a GPT-NeoX model trained on a deduplicated version of the well-known Pile dataset.

The problem is that verifying the licensing claims for these composed solutions is becoming exceedingly hard.

Almost everything in AI now breaks America's copyright principles.
The Silicon Valley ethos has always been: do it first, worry about legality later. If you go bust, nobody will care. If you stay small, you will be ignored. If you go big, lawyers will figure out a way to cut a deal.
That is a thoroughly bankrupt ethos that should be denounced every time it pops up. It is literally condoning criminality.
No crimes in this case, just license breaches. After a few training iterations, it’ll be very muddled anyways.
Yes, I was speaking of the general ethos, not a specific case. But let's take Uber as an example of that ethos in action -- Uber committed actual crimes as part of their growth strategy.
Airbnb and Uber have both committed lots of crimes.
"Won't somebody think of the poor defenseless corporations?!"
Copilot style. Train a distilled model based on it, and now it's a new model unencumbered by copyright.
> llama is licensed as academic only and noncommercial model

Are weights even copyrightable? I was under the impression that they weren't (although it hasn't been tested, and there's a chance they may run afoul of database rights).