by fleventynine
42 days ago
My point is that it is WAY more efficient to put the world's DRAM supply into a shared inference pool than to strand it in local machines, where it won't reach as high a batch size or utilization. The cost of that inefficiency is DRAM prices even higher than we have now, given supply and demand.