by michaelnny 734 days ago
I'm wondering whether the tensor parallel settings affect performance. My naive guess is yes, but I'm not sure.

According to the article: """ AMD Configuration: Tensor parallelism set to 1 (tp=1), since we can fit the entire model Mixtral 8x7B in a single MI300X’s 192GB of VRAM.

NVIDIA Configuration: Tensor parallelism set to 2 (tp=2), which is required to fit Mixtral 8x7B in two H100’s 80GB VRAM. """

1 comment

I personally find such comparisons unfair. A good comparison should optimize for each device configuration: use a model that fits within the VRAM limit, quantize to 8 bits where that boosts performance, and steer clear of each device's shortcomings unless necessary.
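
For what it's worth, here is a rough back-of-the-envelope sketch of why the tp settings differ and how 8-bit quantization would change that. It assumes roughly 46.7B total parameters for Mixtral 8x7B and counts weights only, so KV cache, activations, and framework overhead are ignored; the helper names `weight_gb` and `min_gpus` are made up for illustration. Treat the GPU counts as lower bounds, not as a deployment recipe.

```python
import math

# Assumption: ~46.7e9 total parameters for Mixtral 8x7B.
MIXTRAL_PARAMS = 46.7e9


def weight_gb(params: float, bits: int) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    return params * bits / 8 / 1e9


def min_gpus(params: float, bits: int, vram_gb: float) -> int:
    """Smallest GPU count whose combined VRAM holds the weights alone."""
    return math.ceil(weight_gb(params, bits) / vram_gb)


# VRAM sizes as quoted in the article: MI300X 192GB, H100 80GB.
for name, vram in [("MI300X", 192), ("H100", 80)]:
    for bits in (16, 8):
        gb = weight_gb(MIXTRAL_PARAMS, bits)
        n = min_gpus(MIXTRAL_PARAMS, bits, vram)
        print(f"{name} @ {bits}-bit: {gb:.1f} GB of weights, at least {n} GPU(s)")
```

Under these assumptions the 16-bit weights alone come to about 93 GB, which is why a single H100 cannot hold them while a single MI300X can. At 8 bits the weights would fit on one H100 on paper, though real serving needs headroom beyond the weights.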