by marginalia_nu
634 days ago
If I run some simple inference locally on a 4090 (a 450 W TDP card), it takes on the order of seconds, and that sucker's going full blast the whole time. You're looking at on the order of 1 kJ, which is significantly higher than what is quoted in the article. The article's numbers line up better with CPU inference for ~1 s.
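For anyone who wants to check the back-of-envelope math, here is a rough sketch. It assumes the card draws its full 450 W for the entire run, which is an upper bound and not a measurement, and the 1 to 3 second durations are just illustrative picks for "order of seconds". The function name is made up for this example.

```python
def energy_joules(power_watts: float, seconds: float) -> float:
    """Energy = power x time, assuming constant draw for the whole run."""
    return power_watts * seconds


GPU_TDP_WATTS = 450  # RTX 4090 rated board power; assumed fully used

# Illustrative durations for "order of seconds" (assumption, not measured).
for seconds in (1, 2, 3):
    joules = energy_joules(GPU_TDP_WATTS, seconds)
    print(f"{seconds} s at {GPU_TDP_WATTS} W: {joules:.0f} J ({joules / 1000:.2f} kJ)")
```

A couple of seconds at full draw lands right around 1 kJ, which is where the estimate above comes from. Real draw during inference can sit below TDP, so treat this as a ceiling.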