I think the point is more to demonstrate that such a small model can perform this well than to be of any actual use.
I've just assumed it's down to how it was trained, but I'm no expert.