Hacker News
by huac 1020 days ago
yeah, I hear you that the decoder-only infilling approach is 'weird' -- I just don't know if I agree that it's manifestly worse at language understanding / performance than the BERT approach