by Squeeze2664 323 days ago
How do you determine the importance of a layer in this case?
2 comments

AFAIK they run the model on a test bench and take the activation data from that.
Yes, we have around 1 to 3 million tokens of high-quality, self-verified data that we use to calibrate models!
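
For anyone wondering what "take the activation data" could look like in practice, here is a rough sketch. The thread doesn't say which metric is actually used, so everything here is an assumption for illustration: the relative-change score, the `layer_importance` name, and the toy layers standing in for a real model.

```python
import math

def layer_importance(layers, calibration_inputs):
    """Score each layer by the mean relative change it makes to its input.

    Assumed metric: ||layer(h) - h|| / ||h||, averaged over the calibration
    samples. A layer that barely changes its input scores near zero.
    """
    totals = [0.0] * len(layers)
    for x in calibration_inputs:
        h = list(x)
        for i, layer in enumerate(layers):
            out = layer(h)
            diff = math.sqrt(sum((o - a) ** 2 for o, a in zip(out, h)))
            norm = math.sqrt(sum(a * a for a in h)) or 1.0  # avoid divide-by-zero
            totals[i] += diff / norm
            h = out  # feed this layer's output into the next layer
    return [t / len(calibration_inputs) for t in totals]

# Hypothetical toy "layers": plain functions on a list of floats.
def near_identity(h):
    return [v * 1.01 for v in h]

def big_change(h):
    return [v * 2.0 for v in h]

layers = [near_identity, big_change, near_identity]
calibration_inputs = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]  # stand-in for real tokens

scores = layer_importance(layers, calibration_inputs)
most_important = max(range(len(scores)), key=scores.__getitem__)
```

With a real model you would collect the activations with forward hooks over the calibration set instead of calling toy functions, but the idea is the same: the scores only mean something if the calibration data resembles what the model will actually see.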