by yodon 42 days ago
>Everything was parallelized on Burla, on a single dynamic cluster that scaled to ~1.7K CPU workers for photo download and CLIP, with 20 A100 GPUs running embedding clusters in parallel on the same cluster.

That's a lot of budget - would have been nice if they'd made an actual donation to the project, instead of pounding the project's servers and bandwidth when there are much better ways to interact with the data.

1 comment

Totally fair callout. I should’ve been more careful here and leaned on the provided datasets / bulk access instead of pulling things at scale. That’s on me.

I’ll make a donation to support the project regardless. Appreciate you raising it.

... so you'd only end up making a donation if you ended up "stressing the project's infra more than expected"?!