Hacker News
by
ipsum2
1203 days ago
How did you deal with data contamination?
vov_or
1203 days ago
The datasets we used are pretty clean compared with LAION. But we also filtered out images with captions on them, and filtered the rest by CLIP score. Btw, huge thanks to the LAION and open_clip projects! They inspire us a lot.
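For readers unfamiliar with CLIP-score filtering, here is a minimal sketch of the general idea, not the pipeline used above. It assumes image and caption embeddings have already been computed by some CLIP-like model; the function names, the toy 2-D vectors, and the 0.3 threshold are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def filter_by_clip_score(pairs, threshold=0.3):
    """Keep captions whose image/text embedding similarity meets the threshold.

    `pairs` is a list of (image_embedding, text_embedding, caption) tuples.
    The threshold value is an assumption, not a figure from the thread.
    """
    kept = []
    for image_emb, text_emb, caption in pairs:
        if cosine_similarity(image_emb, text_emb) >= threshold:
            kept.append(caption)
    return kept


# Toy embeddings standing in for real model outputs.
pairs = [
    ([1.0, 0.0], [0.9, 0.1], "a dog on grass"),   # well aligned
    ([1.0, 0.0], [0.0, 1.0], "unrelated text"),   # orthogonal, dropped
]
kept = filter_by_clip_score(pairs)
```

In practice the embeddings would come from a real model (for example one loaded through open_clip), and the threshold would be tuned on a sample of the data.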