by sigmoid10 845 days ago
The architecture is completely public. I would be surprised if certain other players (including but not limited to Mistral AI) are not training models yet. We'll hear soon enough if this is viable. Maybe not for official release candidates, but at least for internal testing.

Nonetheless, this is extremely exciting, unlike RWKV and Retentive Network.
Why? From what I've read, those architectures have many similarities (and the same weaknesses).