by bcatanzaro 60 days ago
Some of these concerns are precisely why we are building Nemotron at NVIDIA. We want to make it possible for people to invent and deploy AI in all sorts of new and unforeseen ways.

Nemotron is:

1. Not just open weight, but open data (to the limits of what is feasible), open recipe, open technique

2. In the future, built by a coalition of organizations coming together to build great, openly developed AI.

Nemotron 3 Super is our most successful model yet. [1]

Ultra is coming soon. And then Nemotron 4.

We can afford to do this because when AI grows, NVIDIA's opportunity also grows.

[1] https://kaitchup.substack.com/p/the-fastest-and-cheapest-120...

2 comments

Nemotron is great, keep up the good work!
How do you justify the compute investment for something like Nemotron, especially if all the labs are willing to pay for those same GPU clusters for inference or training runs?
Nemotron has two reasons to exist, both of which are strategic to NVIDIA.

1. Help NVIDIA design future systems for AI by more deeply understanding what it takes to build AI.

2. Keep the AI ecosystem strong and diverse throughout the world by providing AI infrastructure that many companies can innovate on.

This is not a science project, nor is it for the joy of giving something away. Both of these reasons are core to NVIDIA.

Does Nvidia maintain its own compute hardware expressly for model training? Otherwise, I'm not sure how you keep up with the SOTA model techniques.
Yes