Y
Hacker News
new
|
ask
|
show
|
jobs
by
nestorD
845 days ago
MoE let's you use scale model size up with compute. That leads to hopefully more intelligent models. It, however, is independent with context size: the ability to process a lot of tokens / text.