Hacker News
by PBnFlash 980 days ago
The way these systems work feels massively inefficient.

We don't re-evaluate our astrophysics models when reading a cooking book.

2 comments

Neither does GPT-4 (reportedly a sparse mixture of experts) or other sparse mixture-of-experts models, such as Switch Transformers [1].

[1] https://arxiv.org/abs/2101.03961
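
To make the point concrete, here is a rough sketch of top-1 ("switch"-style) routing in plain Python. It is a toy under stated assumptions, not the implementation from [1] or anything known about GPT-4: the class name `Top1MoE`, the sizes, the random weights, and the use of a single linear map per expert (real experts are feed-forward blocks) are all made up for illustration. The only thing it is meant to show is that a router picks one expert per token and the other experts are never evaluated.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class Top1MoE:
    """Hypothetical toy layer: one router, several experts, top-1 routing."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

        self.router = rand(num_experts, dim)  # one row of logit weights per expert
        self.experts = [rand(dim, dim) for _ in range(num_experts)]
        self.calls = [0] * num_experts  # how many times each expert actually ran

    def forward(self, token):
        probs = softmax(matvec(self.router, token))
        k = max(range(len(probs)), key=probs.__getitem__)  # index of the chosen expert
        self.calls[k] += 1
        out = matvec(self.experts[k], token)  # only expert k is evaluated
        # Scale the expert output by its router probability.
        return [probs[k] * y for y in out], k


moe = Top1MoE(dim=4, num_experts=8)
rng = random.Random(1)
tokens = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(16)]
outputs = [moe.forward(t) for t in tokens]
print(sum(moe.calls), "expert evaluations for", len(tokens), "tokens")
```

With 8 experts and 16 tokens, a dense layer of the same total size would do 8 x 16 expert evaluations; here each token triggers exactly one, which is the "not re-evaluating the astrophysics model while reading a cookbook" property.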

Speak for yourself :)