by godelski
862 days ago
The format isn't explicit to the network, but the training data is usually in RGB format, so that's probably the reasoning. I found a repo where someone tried different formats, but it's worth noting that this was for discrimination, so just because it can discriminate doesn't mean it does the same thing. Maybe I'll run some experiments. You could use a UNet for classification and then look at the bottom layer and do the same thing. It'd be hard to do with SD (or SDXL) because you'd need to retrain with the format. Tuning could possibly work, but the network would likely be biased to understand the RGB encoding. Edit: oops, forgot the link https://github.com/ducha-aiki/caffenet-benchmark/blob/master...
|
It's trivial to convert the values for training - basically 0% of the cost of the process. But there's likely more "meaning" in HSV than in RGB. So I don't think that would account for the difference.
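
For what it's worth, here is a minimal sketch of how cheap the conversion is, assuming 8-bit RGB pixels and using only Python's standard `colorsys` module. The helper name `rgb_image_to_hsv` and the tiny 2x2 image are made up for illustration; a real training pipeline would more likely do this with a vectorized library.

```python
import colorsys


def rgb_image_to_hsv(pixels):
    """Convert rows of 8-bit (r, g, b) tuples to rows of (h, s, v) floats in [0, 1].

    Hypothetical helper for illustration only; it is not taken from any
    particular training codebase.
    """
    return [
        [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for (r, g, b) in row]
        for row in pixels
    ]


# A made-up 2x2 "image": red, green, blue, and mid-gray pixels.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]

hsv = rgb_image_to_hsv(image)
for row in hsv:
    print(row)
```

It's a handful of arithmetic operations per pixel, which is why it's negligible next to the cost of training itself.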