by edude03 808 days ago
Maybe the only downside to how fast LLMs are moving is that papers come out faster than anyone (not at Google) can train and test the improvements.

I got into deep learning around when ReLU and dropout were hot, and on my consumer 1080 I was able to change one or two lines of code and test the improvements in a few hours. Now, I guess I'll need to wait a few weeks for Mistral et al. to try it out.

1 comment

Welcome to the GPU poor!

I'm focusing on quantization approaches and testing on my obsolete last-gen GPUs.

The funny thing is, I have 8 3090s, which last epoch would have put me in, like, the top 1% of compute. Now it's still a lot of compute, but it pales in comparison to the 100x H100 GPU clusters we're seeing today.