by radq 812 days ago
1/3rd "activated parameters", while also requiring 2x the VRAM.

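That's the point of MoE: you sacrifice VRAM to save compute and RAM bandwidth. Every expert has to sit in memory, but only a fraction of the weights are read and multiplied for each token. That makes it a harder sell for consumer devices, where VRAM tends to be the limit, but an easier one for servers, where things are more likely to be compute-bound or RAM-bandwidth-bound.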
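To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the 30B dense baseline, the 60B-total / 10B-active MoE (picked only to match the "2x the VRAM, 1/3rd activated parameters" ratios quoted above), fp16 weights, and the common approximation of about 2 FLOPs per active parameter per token for a forward pass. It ignores the KV cache, activations, and batching effects.

```python
from dataclasses import dataclass

BYTES_PER_PARAM = 2  # assumption: fp16/bf16 weights


@dataclass
class Model:
    name: str
    total_params: float   # every weight that must be resident in VRAM
    active_params: float  # weights actually used for one token


def weight_memory_gb(m: Model) -> float:
    """VRAM needed just to hold the weights."""
    return m.total_params * BYTES_PER_PARAM / 1e9


def weight_read_per_token_gb(m: Model) -> float:
    """Weight bytes streamed from memory per token (single-stream decoding)."""
    return m.active_params * BYTES_PER_PARAM / 1e9


def flops_per_token(m: Model) -> float:
    """Rough forward-pass cost: ~2 FLOPs per active parameter."""
    return 2 * m.active_params


# Hypothetical models, sized only to reproduce the ratios in the thread.
dense = Model("dense", total_params=30e9, active_params=30e9)
moe = Model("moe", total_params=60e9, active_params=10e9)

for m in (dense, moe):
    print(
        f"{m.name}: {weight_memory_gb(m):.0f} GB weights, "
        f"{weight_read_per_token_gb(m):.0f} GB read/token, "
        f"{flops_per_token(m):.1e} FLOPs/token"
    )
```

Under these assumptions the MoE needs twice the memory for weights but does about a third of the per-token reads and arithmetic, which is the trade the comment describes.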