Hacker News new | ask | show | jobs
by nl 1058 days ago
In the Qualcomm AI paper linked in this post it turns out they use a similar testing approach:

BERT 109M, testing perplexity

OPT 125M, testing perplexity

ViT 22M, testing on ImageNet top-1.