by gnunez 494 days ago
Great work! I love your videos; they've taught me so much. Any plans for a Mixture of Experts (MoE) video? My understanding is that, starting with GPT-4, most advanced models use MoE to some extent. For example, can I take the model from your GPT-2 video and just swap the feed-forward layer for an MoE layer like the one found here (1)? I guess I could just try it myself, but I enjoy the expert guidance you provide in your videos. Please don't stop! Great content!

1. https://github.com/mistralai/mistral-inference/blob/main/src...
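
To make the question concrete, here is a rough sketch of the kind of swap I have in mind. It is only an illustration under my own assumptions, not the code from the linked repo or from the video: the names `Expert` and `MoELayer` are made up, it uses plain Python lists instead of PyTorch tensors, it handles one token at a time, and it uses ReLU where GPT-2 uses GELU. The router scores every expert, keeps the top `top_k`, and mixes their outputs with a softmax over the selected scores (one common choice, as far as I understand).

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a short list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weight, x):
    # weight is a list of rows; returns weight @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def make_linear(n_out, n_in, rng):
    # Small random weights, no bias (an assumption, to keep the sketch short).
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


class Expert:
    """Hypothetical stand-in for one GPT-2-style feed-forward block."""

    def __init__(self, d_model, d_hidden, rng):
        self.w_in = make_linear(d_hidden, d_model, rng)
        self.w_out = make_linear(d_model, d_hidden, rng)

    def __call__(self, x):
        hidden = [max(0.0, v) for v in matvec(self.w_in, x)]  # ReLU, not GELU
        return matvec(self.w_out, hidden)


class MoELayer:
    """Hypothetical drop-in for the feed-forward layer: route to top_k experts."""

    def __init__(self, d_model, d_hidden, n_experts=4, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        self.gate = make_linear(n_experts, d_model, rng)  # router: one score per expert
        self.experts = [Expert(d_model, d_hidden, rng) for _ in range(n_experts)]

    def route(self, x):
        logits = matvec(self.gate, x)
        order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        top = order[: self.top_k]
        # Normalize only over the selected experts so the weights sum to 1.
        weights = softmax([logits[i] for i in top])
        return top, weights

    def __call__(self, x):
        top, weights = self.route(x)
        out = [0.0] * len(x)
        for idx, w in zip(top, weights):
            y = self.experts[idx](x)  # only the chosen experts are evaluated
            out = [o + w * yi for o, yi in zip(out, y)]
        return out


layer = MoELayer(d_model=8, d_hidden=32)
token = [0.1 * i for i in range(8)]
chosen, weights = layer.route(token)
output = layer(token)
```

The output has the same width as the input, so in principle it could sit where the feed-forward layer sits in each block. What I can't tell from a toy like this is whether training stays stable without something extra, such as a load-balancing loss.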