by denimboy 895 days ago
mergekit is the tool you need to do this

  https://github.com/cg123/mergekit
You can slice off layers and blend models with different strategies.
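
To make "slice off layers and blend" concrete, here is a toy sketch in plain Python. It is not mergekit's API: `slice_layers` and `blend_linear` are hypothetical helpers, each "model" is just a list of layers holding a few floats, and linear interpolation is only one assumed example of a blending strategy.

```python
# Toy illustration only: hypothetical helpers, not mergekit's API.
# A "model" here is a list of layers; each layer is a list of floats.

def slice_layers(layers, start, end):
    """Keep only layers in the half-open range [start, end)."""
    return layers[start:end]


def blend_linear(layers_a, layers_b, weight=0.5):
    """Interpolate two models layer by layer.

    weight=0.0 returns model A's values, weight=1.0 returns model B's.
    Assumes both models have the same number of layers and layer sizes.
    """
    return [
        [(1.0 - weight) * x + weight * y for x, y in zip(la, lb)]
        for la, lb in zip(layers_a, layers_b)
    ]


model_a = [[0.0, 2.0], [4.0, 6.0], [8.0, 10.0]]
model_b = [[2.0, 4.0], [6.0, 8.0], [10.0, 12.0]]

# Drop the last layer of each model, then average what is left.
merged = blend_linear(slice_layers(model_a, 0, 2), slice_layers(model_b, 0, 2))
```

Real merges work on full weight tensors and have to deal with mismatched architectures, so treat this purely as a picture of the idea.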

Mergekit is the best thing since sliced bread, as the local LLM community already knows.

The dev's blog is great: https://goddard.blog/posts/

...But it's not what this paper is describing. They are basically alternating models, AFAIK. Also, I have other nitpicks with the paper, like using extremely old/mediocre chat models as bases:

> Pygmillion 6B, Vicuna 13B, Chai Model 6B
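
For contrast with weight merging, here is a rough sketch of what "alternating models" could look like at inference time. This is my reading of the comment, not the paper's confirmed method: the weights are never touched, and each turn is simply answered by a different model. `alternating_reply` and the stand-in models are hypothetical, and strict round-robin is an assumption (a random pick per turn would be another reading).

```python
# Hypothetical sketch: switch models per turn instead of merging weights.
# Each "model" is any callable that maps the chat history to a reply.

def alternating_reply(models, history, turn):
    """Answer this turn with models[turn % len(models)] (round-robin)."""
    model = models[turn % len(models)]
    return model(history)


# Stand-in models that only report who answered.
def model_a(history):
    return "A: reply to " + history[-1]


def model_b(history):
    return "B: reply to " + history[-1]


history = ["hi"]
replies = []
for turn in range(3):
    reply = alternating_reply([model_a, model_b], history, turn)
    replies.append(reply)
    history.append(reply)  # every model sees the shared conversation
```

The key difference from mergekit is that all the base models stay loaded and unchanged; only the routing changes between turns.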