|
|
|
|
|
by Vt71fcAqt7
612 days ago
|
|
That sounds like it could be an intiresting metric. Worth noting that there is a difference between an algorithmic "best of n" selection (via eg. an FID score) vs. manual cherry picking which takes more factors into account such as user preference and also takes time to evaluate, which is what GP was suggesting. |
|
Better metrics (assuming goal is text->image) would be some sort of inception score or CLIP-based text matching score. These metrics are computable on single samples.