Hacker News
by ergot_vacation 1863 days ago
No need to resort to goofy phrases like "bougie tech bro." They're just out-of-touch rich people. Same as it ever was.

If that's your concern though, the good news is that in its purest form, machine learning tends to bend AWAY from this. You need large data sets to get good results, which means these projects tend to sample huge chunks of the general Internet, not just the isolated bubbles of SV types. Of course this still has limits, as any data set does. You can only scrape data from the net if someone has posted it in the first place, for example.

But in their initial form, a lot of these models are pretty diverse. That's why AI Dungeon had all kinds of "objectionable" content that kept putting the always-offended on its case: GPT-3 is just built off the general Internet, including a lot of weird, fucked up shit. The real problem is that inevitably someone complains, and the developers start hacking away at the ideal model to try to make it squeaky clean, ruining it in the process.

If you want to keep the tech from being perverted by "bougie tech bros," focus on the censorship. The models often start off pretty good.