Hacker News
by phi-go, 301 days ago
Does this have a compute benefit, or could one use different specialized LLM architectures or models for the subnetworks?