by martinald 345 days ago
You're missing another big advantage: cost. If you can do 1000 tok/s on a $2/hr H100 vs 60 tok/s on the same hardware, you can price it at roughly 1/17th of the price for the same margin.
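A rough sketch of that arithmetic, assuming the GPU rental is the only cost and that the quoted throughput is what one H100 actually sustains (no batching, no idle time). The function name is just for illustration.

```python
# Back-of-the-envelope serving cost using the figures from the comment.
# Assumption: hardware rental is the only cost and the GPU is fully utilized.

def cost_per_million_tokens(gpu_dollars_per_hour: float, tokens_per_second: float) -> float:
    """Hardware cost in dollars to generate one million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_dollars_per_hour / tokens_per_hour * 1_000_000

slow = cost_per_million_tokens(2.0, 60)     # 60 tok/s on a $2/hr GPU
fast = cost_per_million_tokens(2.0, 1000)   # 1000 tok/s on the same GPU

print(f"60 tok/s:   ${slow:.2f} per 1M tokens")
print(f"1000 tok/s: ${fast:.2f} per 1M tokens")
print(f"cost ratio: {slow / fast:.1f}x")
```

Under these assumptions the cost per token scales with the inverse of throughput, so the gap is 1000/60, about 16.7x (about $9.26 vs $0.56 per million tokens).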

You can also slow down the hardware (say, dropping the clock and then voltages) to save huge amounts of power, which should be interesting for embedded applications.
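To put a number on the clock/voltage point, here is a minimal sketch using the textbook CMOS dynamic power model, P proportional to C * V^2 * f. It ignores leakage, and the scaling factors are made up for illustration, not measured on any real part.

```python
# Relative dynamic power under the textbook model P ~ C * V^2 * f.
# Assumption: static/leakage power is ignored; the scale factors are hypothetical.

def relative_dynamic_power(voltage_scale: float, clock_scale: float) -> float:
    """Dynamic power relative to the baseline (1.0 = unchanged)."""
    return voltage_scale ** 2 * clock_scale

half_clock = relative_dynamic_power(1.0, 0.5)          # clock halved, same voltage
half_clock_lower_v = relative_dynamic_power(0.8, 0.5)  # clock halved, voltage down 20%

print(f"half clock:             {half_clock:.2f}x power")
print(f"half clock, 0.8x volts: {half_clock_lower_v:.2f}x power")
```

In this model, halving the clock alone halves power but not energy per token, since the work takes twice as long. The voltage drop that the lower clock allows is where the per-token energy saving comes from.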
Out of curiosity, is anyone here using AI in embedded and has experiences to share? I see NPUs and the like popping up more and more on the credit-card-sized and Buildroot SBCs I get, but with zero documentation or sample scripts for them.