| HN Mirror


	by minimaxir 811 days ago
	That's definitely one way to leverage the GPU VRAM hardware inflation intended for LLM model training.

1 comments

CaptainOfCoit 811 days ago

I'm fairly confident that most of the hardware you see available today (for consumers) wasn't specifically designed with LLMs in mind.

minimaxir 811 days ago

Sure, the 8GB VRAM gaming GPUs aren't designed for LLMs (and would effectively get zero benefit from the data throughput of GPU-accelerated data frames compared to typical approaches), but the 80GB A100s server GPUs definitely are.

CaptainOfCoit 810 days ago

> but the 80GB A100s server GPUs definitely are

I'm sure LLMs were considered, like many other ML use cases, but that the A100 was intended for LLMs? I'm unsure about that.

The A100 was released the same year as GPT-3, and it wasn't until GPT-3 went live that people really started paying attention. And I'm sure designing and producing a GPU takes longer than a couple of months.
