by zirkonit 1976 days ago
First off -- the author has done an amazing tutorial, it's very enjoyable, so I am by no means throwing shade.

But a week of TPUv3-128 is anywhere between $10k and $20k in TPU costs alone; saying that this is an "at home" kind of experiment is cheeky at best, clickbait at worst.


Hi, I love that you enjoyed it!

Yeah, I totally get your point about the title. The TPU quota that I got was roughly the equivalent of $20k, but in my defense, I don't have access to any compute beyond what I get through the TFRC or Google Colab.

Yes, it's an amazing tutorial. Thank you.

Speaking as a hobbyist: in the past, if you had enough determination, you could create just about any software by hacking at it long enough. CPU or cost was generally not an issue; your time and tenacity were.

This has now unfortunately changed, and innovation in software (esp. ML) is now largely about how deep your pockets are.

I think this is quite a rose-colored view of the past. Rendering with many graphics techniques, for example, was out of reach for hobbyists for a long time.
Many hobbies cost $10k-$20k. If you work in engineering, that's not far away from "at home" hobbies.

The time that went into this project was almost certainly worth more than $10k.

I imagine you’re speaking about the cost of e.g. setting up a wood shop in your garage, rather than the cost of making something in said wood shop. Training this seems more like the latter, while the comparable cost is the former.
If you train this model and then use it to do other interesting things, training big models is like setting up a wood shop.
If your hobby is building wood furniture, a wood shop helps you do that hobby into the future. It will improve your projects, and help your enjoyment of your hobby. The tools also hold some sort of residual value.

If your hobby is building AI/ML models, a one-shot trained model isn’t really going to help you on an ongoing basis. It’s an amazing one-shot project, but if your hobby is actually ML then you probably aren’t going to be happy just looking at your completed trained model - you are going to want to train a bigger, better model.

And if your hobby is building software, you can just download a pre-trained model for free.

I don’t think the analogy holds the other way.

You can download a pretrained, full-size GPT-2 for $0. Training it from scratch would be merely for fun. If you have a specific application, you can fine-tune the model for far, far less ($0-$10).

It's not comparable to a hobby. It's comparable to paying $10k to make a sandwich.

Setting up and growing a garden to make a sandwich from scratch.