| HN Mirror



	by maneesh 1521 days ago
	Brilliant. Is the code public?

2 comments

alexpetros 1521 days ago

Unfortunately our agreement with the AI platform powering the project currently prohibits us from sharing the model. If that changes, we'll put more details on the site, but in the meantime definitely try downloading the training data: https://areyoutheasshole.com/training-data

The site itself is a pretty straightforward SvelteKit site: https://kit.svelte.dev/


minimaxir 1521 days ago

The dataset they used is public; the models and the training process they used are not.

https://areyoutheasshole.com/about https://areyoutheasshole.com/training-data


fpgaminer 1521 days ago

> pushshift.io, a website and database which logs of all of the posts that go on Reddit when they get posted

Such a great resource. It's surprisingly easy to build your own massive datasets using it. I re-derived WebText2, used for training GPT-3, just on a home machine. And with some image scraping you can build up image datasets for training interesting GAN models.

> the training process they used are not.

Seems like it'd be fairly straightforward to finetune an existing language model. GPT-3 if you've got spare change; GPT-J-6B can be finetuned in Colab for free, and GPT-NeoX-20B could be finetuned for free/cheap. Use simple concats of AITA posts, each followed by a top comment. Balance for NTA/YTA like the Training Data page mentions, and I'll bet you'll get comparable results.
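To make the data prep concrete, here's a rough sketch of the concat-and-balance step. This is not the creators' pipeline; the record keys (`title`, `body`, `top_comment`, `verdict`) and the function name are hypothetical, and the separator between post and comment is an arbitrary choice.

```python
import random
from collections import defaultdict


def build_balanced_examples(posts, seed=0):
    """Turn AITA-style records into balanced fine-tuning texts.

    `posts` is an iterable of dicts with the hypothetical keys
    'title', 'body', 'top_comment', and 'verdict'.
    """
    by_label = defaultdict(list)
    for post in posts:
        label = post["verdict"]
        if label not in ("NTA", "YTA"):
            continue  # drop other verdicts (ESH, NAH, ...) for simplicity
        # Simple concat: the post, then the top comment as the continuation.
        text = f"{post['title']}\n\n{post['body']}\n\n{post['top_comment']}"
        by_label[label].append({"label": label, "text": text})

    # Downsample the larger class so NTA and YTA appear equally often.
    n = min(len(by_label["NTA"]), len(by_label["YTA"]))
    rng = random.Random(seed)
    balanced = []
    for label in ("NTA", "YTA"):
        balanced.extend(rng.sample(by_label[label], n))
    rng.shuffle(balanced)
    return balanced
```

Each returned `text` can then be fed to whatever causal-LM finetuning setup you prefer. Downsampling throws away data from the majority class; weighting the loss instead would be another option.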

That said, the _idea_ of this bot is really cool and fun.


minimaxir 1521 days ago

Straightforward to tune, but given the dataset size it would require a substantial amount of compute, more than what a Colab session can provide without timing out.

The comments by the creators imply they used some sort of SaaS for both training and deployment.
