Hacker News
by kartoolOz 925 days ago
Hi, thanks for open-sourcing the code! I was trying to reuse it, especially the per-channel dynamic quantization (int8 on GPU), but couldn't get it to work. I also checked out the torchao package, but it looks like it depends on the nightly channel, and SAM's dynamic implementation with Triton has other issues. Is there any clean implementation of int8 dynamic post-training quantization that you can point to?
1 comment

What’s the issue with getting int8 dynamic quantization to work? As in, you’re unable to get it to quantize or to run with speedups?
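
For anyone following along, here is a rough sketch of the arithmetic that "int8 dynamic per-channel quantization" usually refers to. It is not the code from the repo being discussed, nor torchao's or SAM's implementation. It is plain Python with hypothetical helper names (`quantize_rows`, `int8_dynamic_linear`), assuming symmetric quantization with one scale per output channel for the weights and one scale per input row computed at call time for the activations. A real GPU version would run the integer matmul in a fused kernel.

```python
def quantize_rows(matrix):
    """Symmetric int8 quantization with one scale per row (hypothetical helper)."""
    q_rows, scales = [], []
    for row in matrix:
        max_abs = max(abs(v) for v in row)
        # Map the largest magnitude in the row to 127; guard against all-zero rows.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def int8_dynamic_linear(x, w_q, w_scales):
    """Compute y = x @ W^T from pre-quantized weights (hypothetical helper).

    Activations are quantized here, at call time, which is the "dynamic" part.
    """
    x_q, x_scales = quantize_rows(x)
    out = []
    for x_row, x_scale in zip(x_q, x_scales):
        out_row = []
        for w_row, w_scale in zip(w_q, w_scales):
            # Integer multiply-accumulate, then rescale back to float.
            acc = sum(a * b for a, b in zip(x_row, w_row))
            out_row.append(acc * x_scale * w_scale)
        out.append(out_row)
    return out


# Weights have shape (out_features, in_features); each row is one output channel.
weights = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.3]]
w_q, w_scales = quantize_rows(weights)  # done once, ahead of time

x = [[1.0, 2.0, -1.5]]
y = int8_dynamic_linear(x, w_q, w_scales)
```

Comparing `y` against the plain float product of `x` and `weights` is a quick way to tell a quantization bug (large error) apart from a missing speedup (correct output, slow kernel).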