by awuji
336 days ago
You can already run a large open-weight LLM (in the same class as Sonnet 3.5) locally on a CPU with 128 GB of RAM, which costs under 300 USD, and any RAM shortfall can be offset by swap space. Obviously, responses are going to be slower, but I can't imagine people will pay much more than 20 USD just to avoid waiting 30-60 seconds longer for a response. And consumer hardware is already being increasingly optimized for running models locally.
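For a rough sense of the arithmetic behind the 128 GB figure, here is a back-of-the-envelope sketch. The function names, the 20% overhead allowance for KV cache and runtime buffers, and the example model sizes are all my own assumptions for illustration, not measurements of any particular model or runtime.

```python
def estimate_memory_gb(params_billion: float, bits_per_weight: float,
                       overhead: float = 0.2) -> float:
    """Rough memory footprint in GB (hypothetical helper).

    Weights take params * bits / 8 bytes; `overhead` is an assumed
    fractional allowance for KV cache and runtime buffers.
    """
    weight_gb = params_billion * bits_per_weight / 8
    return weight_gb * (1 + overhead)


def swap_needed_gb(required_gb: float, ram_gb: float = 128.0) -> float:
    """GB that would spill to swap if the model does not fit in RAM."""
    return max(0.0, required_gb - ram_gb)


if __name__ == "__main__":
    # Illustrative sizes only: a 70B and a 405B parameter model at 4-bit.
    for size in (70, 405):
        need = estimate_memory_gb(size, bits_per_weight=4)
        spill = swap_needed_gb(need)
        print(f"{size}B @ 4-bit: ~{need:.0f} GB needed, ~{spill:.0f} GB to swap")
```

Under these assumptions a 4-bit 70B model fits in 128 GB with room to spare, while a much larger one would lean on swap, which is where the extra response latency comes from.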