Hacker News new | ask | show | jobs
by sidnb13 984 days ago
> I also believe that within say 1-3 years there will be a different type of training approach that does not require such large datasets or manual human feedback.

I guess if we ignore pretraining, don't sample-efficient fine-tuning on carefully curated instruction datasets sort of achieve this? LIMA and OpenOrca show some really promising results to date.

1 comments

distilbert was trained from Bert. there might be an angle using another model to train the model especially if your trying to get something to run locally.