Hacker News
by cheekygeeky 29 days ago
> He removed the MLP layers from Qwen, and the model could still do transformation tasks on its input but lost knowledge.
But not deterministic?