by mluo 488 days ago
It's simply because the model is small (1.5B parameters), which makes it sensitive to weight perturbations.