by maneesh 1521 days ago
Brilliant. Is the code public?

Unfortunately our agreement with the AI platform powering the project currently prohibits us from sharing the model. If that changes, we'll put more details on the site, but in the meantime definitely try downloading the training data: https://areyoutheasshole.com/training-data

The site itself is a pretty straightforward SvelteKit site: https://kit.svelte.dev/

The dataset they used is public; the models and the training process they used are not.

https://areyoutheasshole.com/about https://areyoutheasshole.com/training-data

> pushshift.io, a website and database which logs of all of the posts that go on Reddit when they get posted

Such a great resource. It's surprisingly easy to build your own massive datasets using it. I re-derived WebText2, used for training GPT-3, just on a home machine. And with some image scraping you can build up image datasets for training interesting GAN models.

> the training process they used are not.

Seems like it'd be fairly straightforward to finetune an existing language model. GPT-3 if you've got spare change, GPT-J-6B can be finetuned in Colab for free, and GPT-NeoX-20B could be finetuned for free/cheap. Use simple concats of AITA posts followed by a top comment. Balance for NTA/YTA like the Training Data page mentions, and I'll bet you'll get comparable results.
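
For what it's worth, here is a rough sketch of the data prep I have in mind, not what the creators actually did. It assumes each record is a dict with hypothetical `post`, `top_comment`, and `verdict` fields (the real training data may be laid out differently). It downsamples the larger class so NTA and YTA are equally represented, then concatenates each post with its top comment.

```python
import random


def build_balanced_examples(records, seed=0):
    """Return post + top-comment strings with equal NTA and YTA counts.

    `records` is an iterable of dicts with the hypothetical keys
    'post', 'top_comment', and 'verdict' (assumed field names).
    """
    by_verdict = {"NTA": [], "YTA": []}
    for record in records:
        # Skip any verdict other than the two being balanced.
        if record["verdict"] in by_verdict:
            by_verdict[record["verdict"]].append(record)

    # Downsample every class to the size of the smallest one.
    n = min(len(group) for group in by_verdict.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_verdict.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)

    # Simple concat: the post, a blank line, then the top comment.
    return [
        r["post"].strip() + "\n\n" + r["top_comment"].strip()
        for r in balanced
    ]


if __name__ == "__main__":
    sample = [
        {"post": "post 1", "top_comment": "NTA, fine", "verdict": "NTA"},
        {"post": "post 2", "top_comment": "NTA, ok", "verdict": "NTA"},
        {"post": "post 3", "top_comment": "YTA, rude", "verdict": "YTA"},
    ]
    for text in build_balanced_examples(sample):
        print(repr(text))
```

The resulting strings could then go into whatever finetuning setup you prefer; downsampling is only one way to balance, and it throws away data from the larger class.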

That said, the _idea_ of this bot is really cool and fun.

Straightforward to tune, but given the dataset size it would require a substantial amount of compute, more than what a Colab can provide without timing out.

The comments by the creators imply they used some sort of SaaS for both training and deployment.