by brucethemoose2
1060 days ago
It's been a while since I looked into this, thanks. As a random aside, I hope y'all publish an SDXL repo for local (non-WebGPU) inference. SDXL is too compute-heavy to split or offload to the CPU the way llama.cpp does, but it's less RAM-heavy than LLMs, and I'm thinking it would benefit from TVM's "easy" quantization. It would be a great backend to hook into the various web UIs, maybe with the secondary model loaded on an IGP.