by bbtc3453
102 days ago
This is impressive. I've been experimenting with the Gemini API for a side project, and the latency difference between local and cloud inference is something I keep thinking about. How does memory usage scale with the 500B models?