Hacker News
by cpldcpu 194 days ago
What also cannot be ignored is that transformer models are a great unifying force. It's basically one architecture that can be used for many purposes.

This eliminates the need for more specialized models, along with the engineering and infrastructure optimization each of them would require.

1 comment

And if better models than transformers are found? Or if someone finds models that do not rely on GPUs or specialized hardware?

Neither the hyperscalers nor NVDA are safe from uncertainty.