The reading here says "the behavior and qualities of these large models is poorly understood".
Prove me wrong?
PS: I agree that BERT-related models have been "wildly popular in NLP for years now".