Hacker News
by rdedev 817 days ago
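Do you think there is a fundamental difference between masked language modelling and causal language modelling? I feel like most LLMs are decoder-only models just because they are easier to train: the causal attention mask is the same for every sequence, whereas in masked language modelling the masked positions have to be re-sampled for each example.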
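
To make the distinction concrete, here is a rough pure-Python sketch of the two setups as I understand them. The function names (`causal_mask`, `mlm_corrupt`), the `[MASK]` string, and the masking probability are all placeholders I made up for illustration, not any library's API.

```python
import random

def causal_mask(n):
    # Row i marks which positions token i may attend to: 0..i only.
    # The pattern depends only on the sequence length, never on the data.
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def mlm_corrupt(tokens, mask_prob=0.15, mask_token="[MASK]", seed=None):
    # Hypothetical helper: hide a random subset of input tokens.
    # The subset is re-drawn on every call (unless a seed is fixed).
    rng = random.Random(seed)
    positions = [i for i in range(len(tokens)) if rng.random() < mask_prob]
    hidden = set(positions)
    corrupted = [mask_token if i in hidden else t for i, t in enumerate(tokens)]
    return corrupted, positions

tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Causal LM: one fixed lower-triangular mask; each position predicts the next token.
mask = causal_mask(len(tokens))

# Masked LM: the model sees the whole corrupted sequence and predicts
# only the tokens at the hidden positions.
corrupted, positions = mlm_corrupt(tokens, mask_prob=0.5, seed=0)
```

Note that in this sketch the causal setup gets a prediction target at every position, while the masked setup only gets targets at the sampled positions, which may be part of why the causal objective feels simpler to train.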