Hacker News
by sillysaurusx 2405 days ago
I was showing someone how GPT-2 could generate human-like text, and an innocent prompt ended up generating a very NSFW story. https://imgur.com/a/tsU82TS

One company recently released a model, but refused to release the decoder. Apparently they had trained it on some Reddit posts (or something like that) and the results were sometimes so offensive that the company wouldn't risk their reputation by releasing the decoder.

I think AI is going to reveal some unsettling things about human nature. For example, I was trying to train a model to morph someone's ethnicity (https://twitter.com/theshawwn/status/1184074334186414080) and ran straight into the problem of bias: black people are heavily underrepresented in FFHQ, the photo database the StyleGAN model was trained on. I had to gather several thousand datapoints, far more than for other groups.

It was a fascinating look into bias in ML -- bias is a real thing that will affect our results, and it's important to go out of your way to correct for it when it affects people. The early model was so bad that if a corporation had been doing the work, they might have just scrubbed the project. But after a few thousand datapoints, it's a very convincing transformation now.

The future of AI-generated content is just fascinating and delightful. And yes, scary. But it's like we're on the edge of... it's hard to put into words. Part of the reason I got into AI was to see what was hype vs what was real. And while we probably won't see AGI, I think we will see endless automated remixing. Imagine having a "blog synth" a few orders of magnitude more sophisticated than this, or an instrument that you can play like a pro within a few minutes. Can't wait for the good stuff.

2 comments

> One company recently released a model, but refused to release the decoder. Apparently they had trained it on some Reddit posts (or something like that) and the results were sometimes so offensive that the company wouldn't risk their reputation by releasing the decoder.

This reminds me of Markov Polov, a Markov chain Twitter bot that uses the tweets of its followers as a learning corpus. It was suspended for harassment.

https://twitter.com/markov_polov
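
For anyone curious what a bot like this does mechanically, here is a minimal sketch of a word-level Markov chain text generator. The corpus, the function names, and the choice of a first-order (one-word) chain are all assumptions for illustration; I don't know how Markov Polov itself was implemented.

```python
import random
from collections import defaultdict


def build_chain(texts):
    """Map each word to the list of words observed directly after it."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return dict(chain)


def generate(chain, start, max_words=20, rng=None):
    """Walk the chain from `start`, sampling each next word at random."""
    rng = rng or random.Random()
    words = [start]
    # Stop at the length limit or when the last word has no known successor.
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


# Hypothetical stand-in corpus; a real bot would pull tweets from an API.
corpus = [
    "the model learns what it reads",
    "the model repeats what it learns",
]
chain = build_chain(corpus)
print(generate(chain, "the", rng=random.Random(0)))
```

Since the chain can only replay word transitions it has actually seen, whatever is in the corpus, offensive or not, comes straight back out.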

It is true that AI can carry bias and often produces results that are unexpected and offensive. A TED talk I watched recently put this in simpler terms:

The danger of AI is weirder than you think (https://www.youtube.com/watch?v=OhCzX0iLnOc)

In terms of maturity, the AI we have now is much closer to a statistical analytics engine than to the all-knowing AI governments shown in sci-fi, which is to say that it is in its very early stages.

I can't wait for the good stuff, but I'm also concerned that there are going to be multiple unexpected ripple effects on the path towards that goal.

The linked TED talk is by Janelle Shane, an optics research scientist and an AI researcher. She maintains a blog, which is quite funny, and has also written a book drawn from her experiences with NNs, particularly GPT-2.

https://aiweirdness.com/