by joefourier
1062 days ago
You don’t have to run Llama 70B on a rented 2x A100 80GB, which is of course going to be quite pricey. Quantising it to 4-bit, as brucethemoose2 mentioned, lets you run it on far cheaper hardware: it’ll fit on a single A6000 (48GB), which can be rented for as low as $0.44/h, about 10x cheaper than the $4.42/h they mentioned for their 2x A100 80GB (speed might be impacted, but it shouldn’t be 10x slower). And if you’re running it on your own machine, the running cost of Llama is just your electricity bill. You can theoretically run it on 2x 3090 (24GB each), which are now quite cheap to buy, or on a CPU with enough RAM (but that will be very, very slow).
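For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. It counts weight memory only and assumes exactly 4 bits per weight. Real 4-bit formats store extra scale data, and you also need room for the KV cache and activations, so treat the numbers as lower bounds. The function and variable names are made up for illustration; the prices are the ones quoted above.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 70e9  # Llama 70B

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # unquantised half precision
q4_gb = weight_memory_gb(N_PARAMS, 4)     # idealised 4-bit quantisation

# VRAM per card in GB
A100_VRAM_GB = 80
A6000_VRAM_GB = 48
RTX_3090_VRAM_GB = 24

# Weights-only fit checks (assumption: ignores KV cache and runtime overhead)
fits_fp16_on_2x_a100 = fp16_gb < 2 * A100_VRAM_GB
fits_q4_on_a6000 = q4_gb < A6000_VRAM_GB
fits_q4_on_2x_3090 = q4_gb < 2 * RTX_3090_VRAM_GB

# Hourly rental prices quoted in the thread
price_ratio = 4.42 / 0.44

print(f"fp16 weights: {fp16_gb:.0f} GB, 4-bit weights: {q4_gb:.0f} GB")
print(f"fits on 1x A6000: {fits_q4_on_a6000}, on 2x 3090: {fits_q4_on_2x_3090}")
print(f"rental price ratio: {price_ratio:.1f}x")
```

On paper the 4-bit weights leave about 13GB of headroom on either the A6000 or the pair of 3090s, and how far that goes depends on context length and the runtime you use.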