Hacker News
by wilsonzlin 777 days ago
Hey, thanks for the kind words. I wasn't able to mention the costs in the post (might follow up in the future), but it was in the hundreds of dollars, so it was reasonably accessible as a hobby project. The GPUs were surprisingly cheap, and I scaled up mostly because I was impatient :) The entire cluster only ran for a few hours.

Do you have any links to your work? It sounds interesting and I'd like to read more about it.

1 comment

"Hundreds of dollars" sounds a bit painful as an EU engineer and entrepreneur :), but I guess it's all relative. We would think twice about investing this much manpower and compute for such an exploratory project even in a commercial setting if it was not directly funded by a client.

But your technical skill is obvious and very impressive.

If you want to read more, my old bachelor's thesis is somewhat related, from when we only had word embeddings and document embeddings were quite experimental still: https://ad-publications.cs.uni-freiburg.de/theses/Bachelor_J...

I've done a lot of follow-up work in my startup Scitodate, which includes large-scale graph and embedding analysis, but we haven't published most of it for now.

A golf membership can cost thousands of euros. Any hobby costs money.
Thanks for sharing, I'll have a read, looks very relevant and interesting!
As an EU-based engineer, you wouldn't do this: it's a massive GDPR violation (failure to notify data subjects of data processing). The GDPR does actually have extraterritorial reach, although I somehow doubt that the information commissioners are going to be coming after OP.
Processing comments on a forum being a violation of the GDPR? That's crazy. The OP is neither the data controller (HN is) nor a data processor on behalf of the controller. If you post your data in public, it's not a GDPR violation for people to use it for things.