by Philpax
990 days ago
I more or less agree with you (I'm not convinced that training models on the imagery of the internet isn't fair use), but I wouldn't rule out a CC0 model just yet.

There's Mitsua Diffusion One [0], which doesn't produce incredible results, but it's a start and they're planning on adding more data, including opt-in work from artists. PIXART-alpha [1] was trained on only 25 million images, and has excellent and competitive results. This could pair well with Fondant AI's 25 million Creative Commons-only dataset [2] (not all CC0, but a sizeable amount).

I don't think it's as far away as you think it is!

[0]: https://huggingface.co/Mitsua/mitsua-diffusion-one
[1]: https://pixart-alpha.github.io/
[2]: https://huggingface.co/datasets/fondant-ai/fondant-cc-25m