Hacker News
by amelius 158 days ago
Nor that the training from scratch will even work.
1 comment

Exactly, that is the current objective: to prove that generation for a specific domain is on par with causal attention models.