by declaredapple
842 days ago
Yeah, the output pricing is really interesting, I think: input tokens are 150% more expensive and output tokens are 250% more expensive. I wonder what's behind that? I guess that suggests the inference time is more expensive than the memory needed to load it in the first place?
Probably that and what you mentioned.