Hacker News
by
rjb7731
1191 days ago
Inference on the Gradio demo seems pretty slow, about 250 seconds per request. Maybe I'm just too used to the 4-bit quant version now, ha!
sebzim4500
1191 days ago
I'm sure it's partially the HN hug of death.