Hacker News new | ask | show | jobs
by wlesieutre 889 days ago
What if you want to start with a subset of the original data? Like you've trained a model, and then later said "You know, this new data we're adding is great, but maybe pulling all those comments from 4chan earlier was a mistake," wouldn't that require starting fresh with access to the actual data?
1 comments

Technically correct but not a very realistic request / approach.

The general idea is to get as good of a mastery of language as possible, generally, and then fine tune to specialize on tasks