by ramesh31
1009 days ago
>So the comparison would be the cost of renting a cloud GPU to run Llama vs querying ChatGPT.

Yes, and it doesn't even come close. Llama2-70b can run inference at 300+ tokens/s on a single V100 instance at ~$0.50/hr. Anyone who can should be switching away from OpenAI right now.
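
For anyone who wants to check the math, here is a rough sketch of the per-token cost those figures imply. It takes the ~$0.50/hr and 300 tokens/s numbers above at face value and assumes the GPU is kept fully busy the whole hour; the function name is made up for illustration.

```python
def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Cost in USD to generate one million tokens on a rented GPU.

    Assumes the instance is fully utilized for the whole billed hour.
    """
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000


# Figures quoted in the comment: ~$0.50/hr, 300 tokens/s.
self_hosted = cost_per_million_tokens(0.50, 300)
print(f"${self_hosted:.2f} per million tokens")  # prints "$0.46 per million tokens"
```

To compare against a hosted API, plug that result in next to whatever per-million-token price the provider currently lists. Idle time on the rented instance pushes the real number up.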