|
|
|
|
|
by jart
806 days ago
|
|
llamafile author here. I'm downloading Mixtral 8x22b right now. I can't say for certain it'll work until I try it, but let's keep our fingers crossed! If not, we'll be shipping a release as soon as possible that gets it working. My recent work optimizing CPU evaluation https://justine.lol/matmul/ may have come at just the right time. Mixtral 8x7b always worked best at Q5_K_M and higher, which is 31GB. So unless you've got 4x GeForce RTX 4090's in your computer, CPU inference is going to be the best chance you've got at running 8x22b at top fidelity. |
|