Y
Hacker News
new
|
ask
|
show
|
jobs
by
storystarling
145 days ago
I thought the RTX 6000 Ada was 48GB? If you have 96GB available that implies a dual setup, so you must be relying on tensor parallelism to shard the model weights across the pair.
1 comments
embedding-shape
145 days ago
RTX Pro 6000 - 96GB VRAM - Single card
link