by Iv 288 days ago
"Starting from a single base LLM"

Ok, zero data, except the data used in the teacher model.

1 comment

Only 1-15 TB of data processed, at $10k-$100M depending on model size. Then this shaves a few hundred to a few grand off fine-tuning. I mean, we're still saving money, at least.