Hacker News
Helix Parallelism: Sharding Strategies for Multi-Million-Token LLM Decoding (research.nvidia.com)
2 points by h6d_100c 344 days ago