by theapadayo
39 days ago
IMO the biggest thing still missing is an actual way to define the model architecture outside of it being hard-coded into the current build. It doesn't need 1:1 performance parity with the fully supported models. Having proper, vendor-validated support on day 1 is the difference between people thinking a model is amazing vs horrible. See the recent Gemma vs Qwen releases. Not sure what the solution is, other than writing a DSL to describe the model graphs which you then embed in the GGUF. The other fallback is to just read the PyTorch modules from the official model releases and convert those to GGML ops somehow.
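To make the DSL idea a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the op names, the node layout (`op` / `in` / `out`), and the choice to serialize the graph as a JSON string are all made up, and none of it is the actual GGML IR or an existing GGUF metadata key. It only shows the shape of the idea: the file carries a declarative graph, and the runtime walks it against a fixed table of supported ops instead of a hard-coded architecture.

```python
import json

# Hypothetical op table. The names are illustrative, not real GGML op identifiers.
# Tensors are plain lists of floats to keep the sketch dependency-free.
OPS = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "relu": lambda a: [max(0.0, x) for x in a],
}

# A graph is an ordered list of nodes: an op, its input names, and its output name.
GRAPH = [
    {"op": "mul", "in": ["x", "w"], "out": "h0"},
    {"op": "add", "in": ["h0", "b"], "out": "h1"},
    {"op": "relu", "in": ["h1"], "out": "y"},
]


def run_graph(graph_json, tensors):
    """Evaluate a serialized graph against named input tensors."""
    env = dict(tensors)
    for node in json.loads(graph_json):
        op = OPS.get(node["op"])
        if op is None:
            # A runtime that lacks an op can fail loudly instead of guessing.
            raise ValueError(f"unsupported op: {node['op']}")
        env[node["out"]] = op(*(env[name] for name in node["in"]))
    return env


# The string a converter might store alongside the weights (assumed layout).
graph_json = json.dumps(GRAPH)
result = run_graph(graph_json, {"x": [1.0, -2.0], "w": [3.0, 3.0], "b": [0.5, 0.5]})
print(result["y"])
```

The point of the explicit op table is that a day-1 model either runs through the generic path or fails with a clear "unsupported op" error, rather than silently running a slightly wrong hand-ported graph.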
I'd still love to see this, but it would need a cheerleader very familiar with the current state of the GGML IR.