Hacker News
by squeaky-clean 1390 days ago
> What are the Windows 10 with nice Nvidia chips w/ CUDA getting?

Are you referring to single iteration step times, or whole images? Because obviously it depends on the number of iteration steps used.

Windows 10, RTX 2070 (laptop model), lstein repo. I get about 3.2 iter/sec. A 50 step 512x512 image takes me 15 seconds.
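As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic, assuming total time is roughly steps divided by the reported iteration rate. The function name `estimated_seconds` is made up for illustration, and fixed overhead such as model loading or saving the image is ignored.

```python
def estimated_seconds(steps: int, iters_per_sec: float) -> float:
    """Rough wall-clock estimate: sampling steps divided by iterations per second.

    Assumption: no fixed overhead (model load, image save) is included.
    """
    if iters_per_sec <= 0:
        raise ValueError("iters_per_sec must be positive")
    return steps / iters_per_sec


if __name__ == "__main__":
    # 50 steps at about 3.2 iter/sec, as reported in the comment
    print(f"{estimated_seconds(50, 3.2):.1f} s")
```

That comes out a little over 15 seconds, which lines up with the reported time for a 50 step 512x512 image.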

3 comments

I’m referring to there being a community effort to normalize performance metrics and results at all, with the M1 devices being in that list as well, so that we don’t have to ask these questions to begin with.

Are you aware of any wiki or table like that?

Huh, that’s the same speed I get on Colab. Pretty good.
I only run 1 sample at a time (batch size 1), forgot to mention that, and that affects the step time.

It looks like each additional image in a batch is cheaper than the first image. For example, if I reduce my resolution so I can generate more in a single batch:

1 image, 50 steps, 320x320: 5s

2 images, 50 steps, 320x320: 8s

3 images, 50 steps, 320x320: 11s

4 images, 50 steps, 320x320: 14s

And the trend continues, and my reported iteration/sec goes down as well. The reported figure doesn't account for the fact that with steps=50 and batch size=4 it's actually running 200 steps, just in 4 parallel parts.
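To make the batch effect concrete, here is a rough sketch that recomputes throughput from the timings listed above. It assumes the progress counter counts one iteration per batch step (which is how the numbers read); the names `timings` and `effective_rates` are made up for illustration.

```python
# Timings reported above: batch size -> total seconds (50 steps, 320x320).
timings = {1: 5, 2: 8, 3: 11, 4: 14}
STEPS = 50


def effective_rates(timings, steps):
    """Return (batch, sec_per_image, batch_iters_per_sec, image_steps_per_sec) rows."""
    rows = []
    for batch, seconds in sorted(timings.items()):
        sec_per_image = seconds / batch
        # What a per-batch counter would show (assumption: one tick per batch step).
        batch_iters_per_sec = steps / seconds
        # Counts every image's step, so batch size 4 at 50 steps is 200 image-steps.
        image_steps_per_sec = steps * batch / seconds
        rows.append((batch, sec_per_image, batch_iters_per_sec, image_steps_per_sec))
    return rows


if __name__ == "__main__":
    for batch, per_image, batch_rate, image_rate in effective_rates(timings, STEPS):
        print(f"batch {batch}: {per_image:.2f} s/image, "
              f"{batch_rate:.2f} batch it/s, {image_rate:.2f} image-steps/s")
```

With these figures the per-batch rate falls as the batch grows, while the per-image rate rises, so comparing iteration/sec across setups only makes sense at the same batch size.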

Wow, that is over twice as fast as my Windows 11, RTX 3080 Ti.
I just commented on another sibling comment (too late to edit the first one), but I forgot to mention my batch size is only 1. I think most people use batch size 4, so basically multiply my time by your batch size for a real comparison.
It was my bad, my script was still running a different fork. Seeing <10 second times with those parameters now. 13.6 seconds for a 3072 × 2048 upscaled image, which I'm particularly happy about.