by popinman322 1194 days ago
Is there something I'm missing? ChatLlama doesn't reference any human feedback datasets.

> You can of course also cross train it using actual ChatGPT.

You mean train it on ChatGPT's output? That's against OpenAI's terms of service.

2 comments

> You mean train it on ChatGPT's output? That's against OpenAI's terms of service.

Oh no, someone call the internet police.

I'm sure scraping tons and tons of images and web data to train DALL-E and GPT, and then selling access to that data to others, was also against many licenses and terms of service, but OpenAI did it anyway.

None of these AIs were created ethically. At the very least we can make sure these huge models don’t solely belong to monopolistic tech companies and democratize their power.

You’re missing something. Both SHP (https://huggingface.co/datasets/stanfordnlp/SHP) and OpenAssistant datasets are referenced.

And while the TOS violation might be the case, the project nevertheless has a mode to use OpenAI in the fine-tuning steps.