by yaroslavvb 70 days ago
This was around the time I trained Transformer-XL (outside of OpenAI) with Ben Mann (https://yaroslavvb.medium.com/scaling-transformer-xl-to-128-...). Originally we wanted to train and release the weights as a kind of GPT-2.5, but our OpenAI friends pushed us to keep the weights closed.
3 comments

> our OpenAI friends

I would be taking a grudge against those "friends" to my grave.

I don't hold a grudge. GPT-2 wasn't that great of a model, so releasing the weights would have been mostly for publicity value. But the blog post already served that purpose.
This project gave me the motivation to build a deep learning next-token prediction integration for JetBrains, because I was using PyCharm at the time. (Eventually it was discontinued because it was kind of expensive to host.)
Well, you all got fooled, didn’t you.

Much like how they claimed to be all about open source.

Same story with Connor Leahy and his GPT-2 clone, though his public articulation of how OpenAI sat him down seems to be glossed over.

"

OpenAI reached out to me almost immediately to talk and they were nothing but respectful and understanding... After making it publicly known what I had done, I was quickly approached by a range of smart people with good arguments. Many of them helped me update my beliefs in light of new evidence...

The day after my announcement, I got to talk to Jack Clark, Alec Radford and Jeff Wu from OpenAI. We had a nice hour long discussion, where I explained where I was coming from, and they helped me to refine my beliefs. They didn’t come in accusing me in any way, they were very clear in saying they wanted to help me gain more important insight into the wider situation. For this open and respectful attitude I will always be grateful. Large entities like OpenAI often seem like behemoths to outsiders, but it was during this chat that it really hit me that they were people just like me, and curious hackers to boot as well.

I quickly began to understand nuances of the situation I wasn’t aware of. OpenAI had a lot more internal discussion than their blog post made it seem. And I found this reassuring. Jack in particular also gave me a lot of valuable information about the possible dangers of the model, and a bit of insight into the workings of governments and intelligence agencies.

After our discussion, I had a lot to think about. But I still wasn’t really convinced to not release. Even some people inside OpenAI were still discussing the not-release policy. So while I definitely had things to consider, I was still mostly set on releasing...

We shouldn’t be angry with OpenAI for what they did. We should applaud them for making a point before it becomes a true problem. Prophylaxis is much better than treatment. I still disagree with some of the things OpenAI did and how they communicated them, but I now understand that sending a message that it is ok, even celebrated, for a lone individual to unilaterally go against reasonable safety concerns of other researchers is not a good message to send. I want to support OpenAI’s message. So, while it might be a small, mostly symbolic gesture, I will not be releasing my model. Some day, someone like me may be in a situation just like mine, but it won’t be GPT2. It might be something much, much more dangerous. And that is the person I am trying to talk to here.

"

https://medium.com/@NPCollapse/the-hacker-learns-to-trust-62...

https://archive.md/1HoGz

Thanks for the share! Didn't realize EleutherAI launched around the same time.