|
|
|
|
|
by timdettmers
4119 days ago
|
|
If you have no desktop PC or no money for a GPU, it might be a better choice to use a EC2 instance instead of buying the hardware. You pay about $11 a week for a EC2, which is quite good once you compare it against the electricity costs that come on top of running a personal computer. The downside is that you have a slow EC2 GPU with 4 GB RAM. Conv nets that take 3 weeks on a EC2, will take less than 2 weeks on a GTX980. If you run large conv nets, the 4 GB can be limiting (for example on ImageNet or similarly sized data sets). Another point is that is more convenient to work on your own desktop and you can run multi-GPU nets, which is not possible on EC2 because the virtualization kills the memory bandwidth between GPUs. If you think about it, over the long term a personal system will just be more cost efficient (you can keep a good system for years). So for deep learning researcher and those that apply deep learning this is just the most cost effective option. A example calculation: You can buy a faster system than a EC2 for roughly $400 (GTX 580 + other parts from eBay). Together with electricity costs thats about 1 year worth of EC2, or 2 years worth if you use deep learning sporadically. A high end deep learning system will be about $1000-1400, which is about 3 years worth of EC2. So a EC2 makes good sense, if you use deep learning only sporadically and work with small data sets. If you use deep learning heavily, want a faster system or want to use multiple GPUs, a personal system will be better. |
|
As soon as you're dealing with EC2, you have to take on the mental overhead of making sure that all your configuration persists between restarts (especially if you're using spot instances!), running start up tasks, mounting EBS and paying for volumes, etc. and in my experience this all really adds up. That said, I still do use EC2 for some things. If I wave five similar models I want to run in parallel, it's as simple as spinning up five identical instances. Also, once I start training new models I can continue to use my workstation without any slowdowns.