Hacker News
by sixo 321 days ago
No, it's what you do if your model architecture is capped out on its ability to profit from further training. Hand-wrapping a bunch of sub-models stands in for models that can learn that kind of substructure directly.