by oneshtein 584 days ago
Almost working:

  [2024-11-09 19:33:55.214] [info] Initializing QuantizedFluxModel
  [2024-11-09 19:33:55.359] [info] Loading weights from ~/.cache/huggingface/hub/models--mit-han-lab--svdquant-models/snapshots/d2a46e82a378ec70e3329a2219ac4331a444a999/svdq-int4-flux.1-schnell.safetensors
  [2024-11-09 19:34:01.432] [warning] Unable to pin memory: invalid argument
  [2024-11-09 19:34:02.143] [info] Done.
  terminate called after throwing an instance of 'CUDAError'
    what():  CUDA error: pointer does not correspond to a registered memory region (at /nunchaku/src/Serialization.cpp:32)
1 comment

Probably make sure your host machine's CUDA is also 12.4. If it isn't, change the CUDA versions in my pastebin to the one you have. I don't think it works with CUDA 11.8 though; I remember trying that once.
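
If it helps, here's a rough sketch of that version check in Python. It reads the "CUDA Version" field from the `nvidia-smi` header, which is the highest CUDA version the installed driver supports, not necessarily the toolkit the container was built with. The helper names (`parse_cuda_version`, `host_cuda_version`, `check`) are made up for this example, and the 12.4 target is just the version mentioned above.

```python
import re
import subprocess
from typing import Optional, Tuple

# Assumption: 12.4 is the version the image was built against (from the comment above).
REQUIRED = (12, 4)


def parse_cuda_version(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from text containing 'CUDA Version: X.Y'."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def host_cuda_version() -> Optional[Tuple[int, int]]:
    """Run nvidia-smi and parse its header; None if it is missing or unparsable."""
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=False
        )
    except OSError:
        # nvidia-smi is not installed or cannot be executed on this machine.
        return None
    return parse_cuda_version(result.stdout)


def check(found: Optional[Tuple[int, int]],
          required: Tuple[int, int] = REQUIRED) -> str:
    """Return a short human-readable verdict for the comparison."""
    if found is None:
        return "unknown: could not determine host CUDA version"
    if found == required:
        return f"ok: host CUDA {found[0]}.{found[1]} matches"
    return (f"mismatch: host CUDA {found[0]}.{found[1]}, "
            f"expected {required[0]}.{required[1]}")


if __name__ == "__main__":
    print(check(host_cuda_version()))
```

A mismatch here doesn't prove that's what causes the `CUDAError` in the log, it just rules one thing in or out before you start editing versions.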

But yeah, I can't help you outside of RunPod; I haven't even tried this on my home PCs yet. For my use case (a serverless API), it seems to work.