by yaantc
805 days ago
Use llamafile [1]. It can be as simple as downloading a file (for Mixtral, [2]), making it executable, and running it. The repo README has all the info; the process is simple, and downloading the model is what takes the most time. In my case I hit the runtime detection issue (explained in the README "gotcha" section). I solved it by running "assimilate" [3] on the downloaded llamafile.

[1] https://github.com/Mozilla-Ocho/llamafile/
[2] https://huggingface.co/jartine/Mixtral-8x7B-Instruct-v0.1-llamafile/resolve/main/mixtral-8x7b-instruct-v0.1.Q5_K_M.llamafile?download=true
[3] https://cosmo.zip/pub/cosmos/bin/assimilate
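
For reference, the steps above could be scripted roughly like this. It is only a sketch: the URLs are the ones from [2] and [3], while the function names, the use of curl, and the `run`/`assimilate` arguments are my own choices for illustration. I am also assuming that assimilate takes the file path as its argument and converts the file in place, so check the README before relying on that.

```shell
#!/bin/sh
# Sketch of the steps from the comment: download, mark executable, run.
# URLs are the ones cited as [2] and [3]; everything else is illustrative.
MODEL_URL='https://huggingface.co/jartine/Mixtral-8x7B-Instruct-v0.1-llamafile/resolve/main/mixtral-8x7b-instruct-v0.1.Q5_K_M.llamafile?download=true'
ASSIMILATE_URL='https://cosmo.zip/pub/cosmos/bin/assimilate'

# Derive a local file name from a URL: drop the query string,
# then keep only the last path component.
local_name() {
    url_without_query=${1%%\?*}
    printf '%s\n' "${url_without_query##*/}"
}

MODEL_FILE=$(local_name "$MODEL_URL")

# Download the llamafile (this is the slow part), make it executable, run it.
fetch_and_run() {
    curl -L -o "$MODEL_FILE" "$MODEL_URL"
    chmod +x "$MODEL_FILE"
    "./$MODEL_FILE"
}

# Fallback for the runtime detection issue from the README "gotcha" section.
# Assumption: assimilate is invoked with the file path and rewrites it in place.
assimilate_model() {
    curl -L -o assimilate "$ASSIMILATE_URL"
    chmod +x assimilate
    ./assimilate "$MODEL_FILE"
}

# Nothing is downloaded unless a step is requested explicitly.
case "${1:-}" in
    run)        fetch_and_run ;;
    assimilate) assimilate_model ;;
esac
```

Saved as, say, get-mixtral.sh (a made-up name), you would run `sh get-mixtral.sh run` first, and only if the llamafile refuses to start, `sh get-mixtral.sh assimilate` followed by running the llamafile again.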