Hacker News
by srean 3536 days ago
> but how many times did PageRank need to be invented? After that the true problems at Google were scaling and monetization.

Quite a few times, actually. It was not at all obvious at that time how to run PageRank and other algorithms efficiently at that scale while keeping running costs down. If it had been just a library call and a delegation away, they wouldn't have had such a meteoric rise.


> Quite a few times, actually. It was not at all obvious at that time how to run PageRank and other algorithms efficiently at that scale while keeping running costs down. If it had been just a library call and a delegation away, they wouldn't have had such a meteoric rise.

We're talking about two different things here: the theoretical PageRank and the practical one. My point is that the skills required to scale a thing like PageRank (writing code to parallelize tasks, divvy up traffic, etc.) are very different from the ones involved in inventing PageRank as an algorithm.

> We're talking about two different things here

No we are not.

PageRank at its mathematical core was not that novel; it's basic undergrad stuff. The application was novel, not the equation, which you would find in a beginner's linear algebra book. The real deal was (i) realizing that those equations can be applied to solve an aspect of web search and (ii) scaling it up on the cheap hardware of that time while keeping operational costs low enough to be profitable. What I am saying is that one needs a good understanding of CS fundamentals and the ability to reason to pull that off with a competitive advantage. You don't get that just by tweaking CSS, or by knowing your Java platform well, or by delegating. These kinds of problems are not one-offs. You have to keep ahead of the competition constantly, innovate constantly, and do stuff that your competition has not yet figured out how to do.
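To show how small the "equation" part is, here is a minimal sketch of the textbook power-iteration form of PageRank in plain Python. It is only an illustration, not how Google implemented it: the function name `pagerank`, the damping value 0.85, the iteration count, and the four-page `toy_web` graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration. `links` maps a page to the pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Rank sitting on pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its rank along every outgoing link.
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, made up for this example.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

On a toy graph this is a dozen lines. The hard part being argued about here is running the same update over billions of pages on cheap machines, which this sketch does not attempt.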

Now that this particular scaling problem has been in the mainstream for a while, it does not seem that big a deal to solve, but it was at that time. If it hadn't been, every run-of-the-mill tech company would have been doing it to eat Google's lunch. Their managers' ability to delegate did not seem to have helped them much there.

>> We're talking about two different things here

> No we are not.

> PageRank at its mathematical core was not that novel; it's basic undergrad stuff.

Whatever you say, Mr. Page. :)

> You don't get that just by tweaking CSS, or by knowing your Java platform well, or by delegating. These kinds of problems are not one-offs. You have to keep ahead of the competition constantly, innovate constantly, and do stuff that your competition has not yet figured out how to do.

More false analogies here. The skills involved in solving technical problems are easily translatable from one technical domain to another. You make it seem like implementing an algorithm to scale servers is necessarily more complex than implementing an algorithm to stack shapes on a webpage in a space-efficient way. It's not.

> Their managers' ability to delegate did not seem to have helped them much there.

You're completely misunderstanding me. I'm not using the term "delegation" here as a managerial term. I would bet you that the team that scaled PageRank relied on countless open-source and freely available tools and tech that others wrote. This doesn't lessen their ingenuity at all, but it should be clear that even the most complex tech is built on the shoulders of others.

Lol do you know PageRank? At its core it's just a random walk on a graph; this stuff is taught in intro linear algebra courses. Of course, modelling the web this way and tweaking the middle to produce optimum results was a big deal. You'd be a fool to believe that PageRank hasn't evolved in 20 yrs.

> Lol do you know PageRank?

Oh, please. This back-and-forth and condescending attitude is getting so tiring for me. Yes, I've read the paper multiple times.

> this stuff is taught in intro linear algebra courses

I graduated with a B.S. in Applied Mathematics from Yale. I'm familiar with this stuff, thanks.

> At its core it's just a random walk on a graph;

I'm tired of making this point. There is a big difference between understanding a ground-breaking discovery and making the discovery itself.

> Of course, modelling the web this way and tweaking the middle to produce optimum results was a big deal. You'd be a fool to believe that PageRank hasn't evolved in 20 yrs.

I never once said this. I said that after PageRank was developed, most of Google's engineering resources went into scaling and monetization.

Scaling has its own set of algorithmic challenges. At small scale there's not much to be gained from asymptotic complexity improvements, but at the scales Google operates at there definitely is. I'd wager that a lot of the work Google does on its distributed computing platforms involves algorithmic challenges.
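A back-of-the-envelope sketch of that point, with made-up numbers: compare the work for one rank update over a dense n x n link matrix against one that only walks the links that exist. The helper names `dense_ops` and `sparse_ops` and the figure of 10 links per page are assumptions for illustration, not measurements of anything.

```python
def dense_ops(n_pages):
    # A dense n x n matrix-vector product touches every matrix entry once.
    return n_pages * n_pages


def sparse_ops(n_pages, avg_links):
    # Walking an adjacency list touches each existing link once.
    return n_pages * avg_links


AVG_LINKS = 10  # assumed average out-degree, for illustration only

for n in (1_000, 1_000_000, 1_000_000_000):
    ratio = dense_ops(n) // sparse_ops(n, AVG_LINKS)
    print(f"{n:>13,} pages: dense update does {ratio:,}x the work of the sparse one")
```

The ratio is n divided by the average link count, so it grows with the number of pages: modest on a thousand pages, and the difference between feasible and infeasible at web scale.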