by keonix 943 days ago
Wait until you hear about frankenmodels. You rip parts out of one model (often attention heads), transplant them into another, and somehow that produces coherent results! Witchcraft.

https://huggingface.co/chargoddard


> somehow that produces coherent results

With or without finetuning? Also, is there a practical motivation for creating them?

> with or without finetuning?

With, but it's still bonkers that it works so well

> Also is there a practical motivation for creating them?

You could get in-between model sizes (like 20B instead of 13B or 34B). Before better quantization it was useful for inference (if you were unlucky with VRAM size), but now I see this being useful only for training, because you can't train on quants.
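
To make the in-between-size arithmetic concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption of mine rather than something stated in the thread: it assumes a Llama-2-13B-style decoder (hidden size 5120, MLP size 13824, vocabulary 32000, 40 layers, untied output head) and a hypothetical frankenmodel that simply stacks 62 such layers. The function names are made up for illustration.

```python
# Rough parameter count for a Llama-2-style decoder.
# All shapes below are assumed defaults, not figures from the thread.

def layer_params(hidden: int, intermediate: int) -> int:
    attention = 4 * hidden * hidden      # q, k, v, o projections, no biases
    mlp = 3 * hidden * intermediate      # gate, up, down projections
    norms = 2 * hidden                   # two RMSNorm weight vectors per layer
    return attention + mlp + norms

def model_params(n_layers: int, hidden: int = 5120,
                 intermediate: int = 13824, vocab: int = 32000) -> int:
    embeddings = 2 * vocab * hidden      # input embeddings + untied output head
    final_norm = hidden                  # last RMSNorm before the output head
    return n_layers * layer_params(hidden, intermediate) + embeddings + final_norm

base = model_params(40)      # assumed 13B-class layer count
stacked = model_params(62)   # hypothetical stack of duplicated/transplanted layers

print(f"{base / 1e9:.1f}B -> {stacked / 1e9:.1f}B")  # 13.0B -> 20.0B
```

The count only depends on how many layers you end up with, not on which donor layers you pick, so it says nothing about whether the result is coherent; that part still needs the finetuning mentioned above.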

> With, but it's still bonkers that it works so well

Ehhhh…