Hacker News
by ZeroTalent 408 days ago
Look into groq.com, guys. It serves some good models at speeds similar to Inception Labs.
2 comments

Its inference is faster because of the hardware (LPUs); the question here is about architectures (autoregressive vs. diffusion).
I realize that, but it can be used right now with many models in real-world situations. I just wanted to mention it in case someone doesn't know about it.
SRAM doesn't scale with advanced semiconductor nodes.

Groq is heading to a dead end.