by ACCount37
84 days ago
There's "cheap" bulk data - simple synthetics, unfiltered scrapes. That's used for pre-training, especially early pre-training. And then there's "expensive" data: human domain expert solutions, made by people you hire for $100 an hour. That's used for SFT. For "expensive" data, it makes a lot of sense to use every trick in the book to squeeze that data for all it's worth.