by rahen 78 days ago

Yes. The Cray supercomputers from the 80s were crazy good matmul machines in particular. The quad-CPU Cray X-MP (1984) could sustain 800 MFLOPS to 1 GFLOPS and, with a 1 GB SSD, had enough compute power and bandwidth to train a 7-10M-parameter language model in about six months, and to run inference at 18-25 tok/sec.
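For anyone who wants to sanity-check figures like these, here is a rough back-of-envelope sketch. It is not how the numbers above were derived. It assumes the common rule of thumb of about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per generated token, takes "six months" as 182 days, and ignores memory bandwidth and I/O entirely. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def training_flops_budget(sustained_flops, days):
    """Total floating-point operations available at a sustained rate."""
    return sustained_flops * days * SECONDS_PER_DAY


def tokens_trainable(budget_flops, n_params):
    """Training tokens affordable, ASSUMING ~6 FLOPs per parameter per token."""
    return budget_flops / (6 * n_params)


def peak_tokens_per_sec(sustained_flops, n_params):
    """Compute-only inference ceiling, ASSUMING ~2 FLOPs per parameter per token."""
    return sustained_flops / (2 * n_params)


if __name__ == "__main__":
    # Sustained rates and model sizes quoted in the comment; 182 days is an assumption.
    for sustained in (800e6, 1e9):
        for n_params in (7e6, 10e6):
            budget = training_flops_budget(sustained, days=182)
            tokens = tokens_trainable(budget, n_params)
            ceiling = peak_tokens_per_sec(sustained, n_params)
            print(
                f"{sustained / 1e6:.0f} MFLOPS, {n_params / 1e6:.0f}M params: "
                f"~{tokens / 1e6:.0f}M training tokens, "
                f"<= {ceiling:.0f} tok/sec (compute-only ceiling)"
            )
```

Under those assumptions the compute-only inference ceiling comes out at roughly 40 to 71 tok/sec, so a real-world 18-25 tok/sec sits below it, which is what you would expect once overheads are counted.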

A mid-90s Cray T3E could have handled GPT-2 124M, 24 years before OpenAI.

I also had a punch-card computer from 1965 learn XOR with backpropagation.

The hardware was never the bottleneck, the ideas were.

lucasfin000 77 days ago

Post-quantum crypto is a good example of this. Lattice-based schemes were theorized in the 90s, but they took decades to actually reach production. The math existed and the hardware existed, but the ideas for making it work were just not there yet.

CamperBob2 78 days ago

> The hardware was never the bottleneck, the ideas were.

For sure. Minsky and Papert really set us back.

Onavo 78 days ago

They should have lived to see the results of the bitter lesson.

CamperBob2 77 days ago

Minsky came close (d. 2016) -- although he may have had other interests later in life, if the Epstein file dumps are to be believed.
