by ndesaulniers 2976 days ago
> I'm not sure what you mean by google does the entire stack.

Consider that Google has some of the best machine learning researchers, compiler engineers, hardware engineers, and infrastructure in the business working on this.


Huh? Machine learning and infrastructure engineers, yes. Compiler and hardware engineers? No. What gives you reason to believe they have a lead in either of those departments, other than that they have a lot of money? They're forced to use the same foundry as Nvidia, and their hardware team is likely significantly smaller.

Google has been buying up AI resources since well before anyone else and has the strongest and deepest team at this point.

It is why so many of the breakthroughs have come from Google. A great example is winning at Go almost a decade earlier than anyone thought possible.

They probably have two of the strongest teams: the Brain team and the DeepMind team. But all the other engineering and infrastructure at Google is first rate too.

Really, at this point I do not think the $100B in cash is as important as the fact that Google already built the team, and experienced people are now far more difficult to get.

The other advantage for Google is its ability to attract the top engineers.

Google just got started a lot earlier on all of this.

Google got started a lot earlier on this? Did you read what you are saying? Nvidia has been making hardware longer than Google has been a company. No, Google does not have a better hardware team. Google has the luxury of making a device that is used for a single purpose that they control. Nvidia made a device that can be used for far more and works on commodity hardware. By the way, DeepMind/AlphaGo uses Nvidia GPUs, so that was an extremely bad example.

BTW, DeepMind now uses TPUs for both training and inference, and with the results we can see why.

https://www.theverge.com/circuitbreaker/2016/5/19/11716818/g... Google reveals the mysterious custom hardware that powers AlphaGo

The hardware is optimized for NNs. Nvidia's dominant focus has been graphics. That is a big difference, and we can see the results in this article.

Plus, Google benefits from not having the baggage that Nvidia has.

But you are never going to be able to use a TPU for graphics.

In the end it is about results.

Tensor cores are hardware optimized for NNs. You call it baggage, Nvidia calls it extra revenue, because some people need double precision, and those people are willing to pay a lot of money. So the V100 continues to be the cheapest way to train and do inference on NNs, because you can actually amortize the server cost over time. With a TPU, you pay the hourly price forever. TPUs are better only for NN jobs that are short, or when you don't have the capital to buy a server. For anything longer, you can buy a Titan V and come out far ahead.

By the way, the Tesla cards have no graphics output, so I'm not sure why you'd say they have graphics baggage.
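
To make the amortization argument concrete, here is a rough sketch of the break-even arithmetic. Every number in it is a made-up placeholder, not a real quote for a Titan V, a V100, or a cloud TPU, and the function name `break_even_hours` is just something chosen for illustration. It also ignores the rest of the server, cooling, resale value, and the cost of capital, so treat it as a way to frame the question rather than an answer.

```python
import math


def break_even_hours(purchase_price: float,
                     own_hourly_cost: float,
                     rental_hourly_price: float) -> float:
    """Hours of use after which buying is cheaper than renting by the hour.

    purchase_price      -- up-front cost of owned hardware (hypothetical)
    own_hourly_cost     -- running cost per hour when you own it, e.g. power
    rental_hourly_price -- hourly price of the rented accelerator
    """
    saving_per_hour = rental_hourly_price - own_hourly_cost
    if saving_per_hour <= 0:
        # Renting is never more expensive per hour, so buying never pays off.
        return math.inf
    return purchase_price / saving_per_hour


# Placeholder inputs only; plug in real prices before drawing conclusions.
hours = break_even_hours(purchase_price=3000.0,
                         own_hourly_cost=0.10,
                         rental_hourly_price=6.50)
print(f"Buying breaks even after roughly {hours:.0f} hours of use")
```

Short jobs sit well below the break-even point, which is the case where renting wins; sustained workloads sit above it. The up-front purchase is still real money spent, so the result depends entirely on how many hours you actually run.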

The problem for Nvidia is that they do NOT do the entire stack. So Google has the ability to optimize better, and here we are seeing those results: using TPUs is about half the price of Nvidia hardware.

Baggage is a company thing. Google really has been an AI company since the late 90s, when Larry Page was asked about using AI to improve search and he replied he was using search to make AI happen.

Ha! When you amortize you are still spending money, and it really bothers me that you say this; it is such a problem.

Too many people look at things the way you do, and that is why companies get into problems. Capitalizing is not magic.

BTW, Google is also going to be able to iterate much quicker as the AI breakthroughs happen and come out with new versions that should stay well ahead of Nvidia.

The dynamics of the chip business have changed. It used to be that companies bought chips from someone, put them into servers, and sold the servers.

The problem is that the companies making the chips are NOT running the chips and do not have any skin in the game or the data needed to improve.

Now we have companies like Google making the chips and also running them, which is why we see power footprint being the focus far more than in the past.

We will see all the big operations including Amazon make their own chips more and more.

A perfect example is capsule networks replacing some uses of CNNs. Google, with Hinton, developed the capsule network approach and will be supporting it far faster than you will see from Nvidia.

Then there is the fact that the canonical framework for AI is TF.

All of these were theoretical advantages for Google, and now we get to see that they appear to be real, with the pricing of TPUs being about half the cost of using Nvidia.