by verdverm
919 days ago
This sounds like the methodology from "Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes" (https://arxiv.org/abs/2305.02301, May '23), i.e. master teaches apprentice, or an LLM trains an SLM.