by muttled 900 days ago
It is if you can pool the memory. That's easier than splitting the model across the cards in software (though that's a somewhat solved problem), and from what I know it allows higher GPU utilization on both cards, since they don't have to wait for information to pass back and forth.