by navbaker 534 days ago
At the 48GB level, L40S are great cards and very cost effective. If you aren’t aiming for constant uptime on several >70B models at once, they’re for sure the way to go!

> L40S are great cards and very cost effective

from https://www.asacomputers.com/nvidia-l40s-48gb-graphics-card....

nvidia l40s 48gb graphics card Our price: $7,569.10*

Not arguing against 'great', but the cost efficiency is questionable: for about 10% of that price you can get two used 3090s. The good thing about LLMs is that their layers run sequentially, so the model can be split into sub-models, one per GPU, and spread across cards fairly easily. With 2, 3, 4... GPUs, throughput on big batches should then improve roughly proportionally, and it becomes possible to run a bigger model on low-end hardware.
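
To make the "split into sub-models" idea concrete, here is a rough sketch of assigning contiguous layer ranges to GPUs. The function name `split_layers` and the 80-layer count are made up for illustration; this is not any particular framework's API, just the bookkeeping behind a layer-wise split.

```python
def split_layers(num_layers: int, num_gpus: int) -> list[range]:
    """Assign contiguous layer ranges to GPUs as evenly as possible.

    Hypothetical helper for illustration only: each returned range is the
    "sub-model" (consecutive layers) that one GPU would hold.
    """
    if num_gpus < 1 or num_layers < num_gpus:
        raise ValueError("need at least one layer per GPU")
    base, extra = divmod(num_layers, num_gpus)
    stages = []
    start = 0
    for gpu in range(num_gpus):
        # The first `extra` GPUs take one additional layer each.
        size = base + (1 if gpu < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


# Illustrative layer count (an assumption), split across two cards.
for gpu, layers in enumerate(split_layers(80, 2)):
    print(f"GPU {gpu}: layers {layers.start}..{layers.stop - 1}")
```

Because each stage waits on the previous one, a single request only keeps one GPU busy at a time; the proportional speedup depends on having enough batched work in flight to keep every stage occupied.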

Dual 3090s are way cheaper than the L40S though. You can even buy a few backups.
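
For a rough sense of the gap, here is the cost per GB of VRAM using only the numbers in this thread. The dual-3090 figure is the "10%" estimate from the comment above, not a quoted price, and `cost_per_gb` is just a helper name for this sketch.

```python
def cost_per_gb(price_usd: float, vram_gb: int) -> float:
    """Dollars per GB of VRAM."""
    return price_usd / vram_gb


L40S_PRICE = 7569.10                 # listing price quoted in the thread
DUAL_3090_PRICE = 0.10 * L40S_PRICE  # assumption: the "10%" estimate above
VRAM_GB = 48                         # one L40S, or two 3090s at 24 GB each

print(f"L40S:    ${cost_per_gb(L40S_PRICE, VRAM_GB):.2f}/GB")
print(f"2x 3090: ${cost_per_gb(DUAL_3090_PRICE, VRAM_GB):.2f}/GB")
```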
Yeah, I’m specifically responding to the parent’s comment about the 48GB tier. When you’re looking in that range, it’s usually because you want to pack as much VRAM as possible into your rack space, so consumer-level cards are off the table. I definitely agree multiple 3090s are the way to go if you aren’t trying to host models for smaller-scale enterprise use, which is where 48GB cards shine.