by huac 634 days ago
One guess is that the live demo is quantized to run fast on cheaper GPUs, and that the quantization degraded its output quality a lot.
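To make the guess concrete, here is a minimal sketch of why lower-precision weights can hurt quality. It says nothing about how the demo is actually served: the Gaussian "weights", the symmetric per-tensor scheme, and the helper names `quantize_dequantize` and `mean_abs_error` are all assumptions for illustration.

```python
import random

def quantize_dequantize(weights, bits):
    """Symmetric uniform quantization to `bits` bits, then back to float.

    Hypothetical helper: one scale for the whole tensor, no calibration.
    """
    levels = 2 ** (bits - 1) - 1              # 127 for 8-bit, 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs > 0 else 1.0
    # Snap each weight to the nearest representable grid point.
    return [round(w / scale) * scale for w in weights]

def mean_abs_error(original, approx):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(original, approx)) / len(original)

random.seed(0)
# Stand-in for a layer's weights; real models are not exactly Gaussian.
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

err8 = mean_abs_error(weights, quantize_dequantize(weights, 8))
err4 = mean_abs_error(weights, quantize_dequantize(weights, 4))
print(f"8-bit mean abs error: {err8:.5f}")
print(f"4-bit mean abs error: {err4:.5f}")
```

The 4-bit grid is much coarser than the 8-bit one, so the rounding error per weight is larger. Whether that turns into a visible quality drop depends on the model and the quantization method, which this toy example does not capture.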