Hacker News
by boywitharupee 654 days ago
What kind of model architecture was used for this? Is it safe to assume they used a transformer or a variant of one?