by westurner
230 days ago
How often do hardware optimizations get created for lower level optimization of LLMs and Tensor physics? How reconfigurable are TPUs? Are there any standardized feature flags for TPUs yet? Is TOPS/Whr a good efficiency metric for TPUs and for LLM model hosting operations?

From https://news.ycombinator.com/item?id=45775181 re: current TPUs in 2025; "AI accelerators":

> How does Cerebras WSE-3 with 44GB of 'L2' on-chip SRAM compare to Google's TPUs, Tesla's TPUs, NorthPole, Groq LPU, Tenstorrent's, and AMD's NPU designs?
> How often do hardware optimizations get created for lower level optimization of LLMs and Tensor physics?
LLMs? all the time. "tensor physics" (whatever that is)? never.
> How reconfigurable are TPUs?
very? as reconfigurable as any other programmable device?
> Are there any standardized feature flags for TPUs yet?
have no idea what a feature flag is in this context nor why they would be standardized (there's only one manufacturer/vendor/supplier of TPUs).
> Is TOPS/Whr a good efficiency metric for TPUs and for LLM model hosting operations?
i don't see why it wouldn't be? you're just asking whether (stuff done)/(energy consumed) is a good measure of efficiency, to which the answer is yes.
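
To make the (stuff done)/(energy consumed) point concrete, here is a rough sketch of the unit arithmetic. Since TOPS is already a rate (tera-operations per second) and a watt is a joule per second, TOPS/W works out to tera-operations per joule; scaling by 3600 gives operations per watt-hour. The function names and the 100 TOPS / 50 W figures are made up for illustration and do not describe any real chip.

```python
JOULES_PER_WATT_HOUR = 3600.0  # 1 Wh = 1 W sustained for 3600 s


def ops_per_joule(tops: float, watts: float) -> float:
    """Operations per joule for a device sustaining `tops` tera-ops/s at `watts` W."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    # (ops/s) / (J/s) = ops/J
    return tops * 1e12 / watts


def ops_per_watt_hour(tops: float, watts: float) -> float:
    """Operations completed per watt-hour of energy consumed."""
    return ops_per_joule(tops, watts) * JOULES_PER_WATT_HOUR


if __name__ == "__main__":
    # Hypothetical accelerator: 100 TOPS sustained at 50 W (illustrative numbers only).
    print(ops_per_joule(100.0, 50.0))
    print(ops_per_watt_hour(100.0, 50.0))
```

Either form is (work)/(energy); the caveat is only that the "ops" being counted (precision, sparsity, sustained vs. peak) have to be comparable between the devices.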