by Taek 1144 days ago
How is this different from what RedPajama is doing?

Also, most people don't much mind running LLaMA 7B at home, since the license is hard to enforce against individuals, but a lot of commercial businesses would love to run a 65B-parameter model and can't, because the license is far more prohibitive in a business context. Open versions of the larger models would be a lot more meaningful to society at this point.

2 comments

RedPajama is creating a dataset. This is a permissively licensed model trained on that dataset.
RedPajama is also training both foundation and instruct-tuned models.

Source: https://twitter.com/togethercompute/status/16527350961501757...

I agree with this. For a lot of companies, spending hundreds of thousands of dollars or single-digit millions on fine-tuning, inference, and so on is entirely feasible, but using model weights with clouded legal status isn’t.