by p1esk
2427 days ago
You’re comparing sentence classification done using transformer embeddings to older results which use inferior embeddings. How do regular convnets do when you feed them transformer embeddings? Re learning reverse graphics - ok, maybe it is indeed the main feature of your work. I’d need to look into that, because from skimming your paper it’s not immediately clear what’s going on there. Re convnet accuracy on Norb - I’m willing to make that effort for cifar-10 as soon as you have the results.
Actually, I'm comparing it to recent models, including XLNet, MT-DNN, Snorkel, and (of course) BERT. AFAIK, convnets have not been able to outperform multihead self-attention, even on pretrained embeddings.
> Re learning reverse graphics - ok, maybe it is indeed the main feature of your work. I’d need to look into that, because from skimming your paper it’s not immediately clear what’s going on there.
I agree, it's not immediately clear. Nonetheless, I find it kind of unbelievable that a model with so few parameters can seem to do it. (I was shocked when I first saw the plots.)
> Re convnet accuracy on Norb - I’m willing to make that effort for cifar-10 as soon as you have the results.
That's a little disappointing... but OK.
Thank you so much for all your questions :-)