Hacker News
by nefitty 1658 days ago
There are open versions of GPT, so presumably one could train an open version of Codex. It would be really terrible if Codex were made inaccessible to swaths of people in the name of commercial interests.

GitHub sent OpenAI something like 57 terabytes of data from GitHub. Good luck scraping that.

(I helped build The Pile, the largest openly-available text dataset.)

You're right that you theoretically can do this, but doing it in practice requires either funding or time.

Yeah, I thought of mentioning that but wasn't sure how in the weeds anyone would want to go, lol. Besides, I'm optimistic about what an enterprising individual is capable of when faced with those sorts of limits... It's those clear bounds that set creativity free.

By the way, I'm so fucking stoked that Shawn Presser of The Pile responded to me. Your work is proto-solarpunk incarnate. Really amazing contributions dude, can't wait to see what's next.

I'm really happy to hear that. Thank you.

When I started out, I only wanted to make some small contribution somewhere. It's really surreal that there are people rooting for me now. I'll do my best to continue to contribute in ways that I can.

You can too, by the way. There's not a lot of difference between me and you. I believe in you.

I was literally sitting here trying to stop the waves of sadness I'm feeling from not meeting my own expectations. Bumping into a kindred spirit that's getting shit done really helps. Thank you for the nudge.

No stress, friend :) Remember, small contributions really matter! You can do it!