Hacker News
by christianqchung 325 days ago
A little bit of training data has certainly gotten in there, but I don't see any reason for them to deliberately distill from such an old model. Models have always been really bad at telling you what model they are.