by abra0
847 days ago
Well, if you are not using a rented machine during a period of time, you should release it. Agreed on reliability and data transfer, that's a good point. Out of curiosity, what do you use a 2x3090 rig for? Bulk not time-sensitive inference on down quanted models?
If you're using them for inference, your usage pattern is unpredictable. I could go hours between uses, or only minutes. If you shut the machine down and release it, the host might be gone the next time you want it.
> what do you use a 2x3090 rig for? Bulk not time-sensitive inference on down quanted models?
Yeah. I can run 7B models unquantized, ~13-33B at q8, and ~70B at q4, at fairly acceptable speeds (>10 tok/s).
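
For a rough sanity check on why those sizes fit, here is a back-of-the-envelope sketch in Python. It assumes "unquantized" means fp16 (16 bits per weight), treats q8 and q4 as exactly 8 and 4 bits per weight (real quant formats carry some extra overhead), and counts only the weights, ignoring KV cache and activations. The function name `weight_gib` and the config list are just for illustration.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Two RTX 3090s with 24 GiB of VRAM each.
TOTAL_VRAM_GIB = 2 * 24

# (label, parameters in billions, assumed bits per weight)
configs = [
    ("7B fp16", 7, 16),
    ("13B q8", 13, 8),
    ("33B q8", 33, 8),
    ("70B q4", 70, 4),
]

for name, params, bits in configs:
    size = weight_gib(params, bits)
    fits = size < TOTAL_VRAM_GIB
    print(f"{name}: ~{size:.1f} GiB of weights, fits in {TOTAL_VRAM_GIB} GiB: {fits}")
```

Under those assumptions every config lands under 48 GiB, with the 33B q8 and 70B q4 cases leaving the least headroom for context.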