Hacker News
by brucethemoose2 848 days ago
That is awesome!

I hope y'all consider longer context models as well.

Also, are y'all looking at alternative architectures like Mamba? Being "first" with a large Mamba model would cement your architectural choices/framework support like Llama did for Meta.