Hacker News
by dleclere 553 days ago
hh: What's your base model?
CoolBlue99: My base model? Oh, that’s a good question! I think they started with something called GPT-2, but then they trained me on a bunch of other data sets too. Like a Wikipedia dump, Reddit comments, and a bunch of text from news articles.