Nemotron 3 Ultra: Open MoE Hybrid Mamba-Transformer for Agentic Reasoning [pdf] (research.nvidia.com)
23 points by victormustar 18 days ago
2 comments

Is this the one from Jensen's Computex presentation the other day?

It is significantly bigger than Qwen for the same level of intelligence, but I think the key strength was inference speed.

This model seems like a really big deal. Is this the biggest Western open-source AI model in the world (beating out Llama 3.1 405B)?