by netnichols
4049 days ago
They probably don't care about that content. My first guess would be that they snapshot the DOM in the JS tick immediately after window.onload completes. Maybe they have a short pause to let any fast timeouts or callbacks complete, but there's got to be a cutoff at some point (e.g. to stop an infinite wait for pages that continuously update a relative date). Of course, with their own JS engine, I bet they can get really fancy with the heuristics to determine when to take that snapshot.
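
To make that guess concrete, here's a rough sketch of a "short grace period plus hard cutoff" rule. It is only an illustration of the idea, not anything known about the real crawler: the function name `choose_snapshot_time`, the timer representation, and the 50 ms / 500 ms values are all made up for the example.

```python
import heapq

def choose_snapshot_time(onload_ms, timers, grace_ms=50, cutoff_ms=500):
    """Return the virtual time (ms) at which to snapshot the DOM.

    timers: list of (fire_at_ms, repeat_every_ms) pending at onload;
            repeat_every_ms is 0 for a one-shot timer.
    All thresholds are hypothetical values chosen for illustration.
    """
    deadline = onload_ms + cutoff_ms   # hard cutoff: never wait past this
    snapshot_at = onload_ms            # default: right after onload
    queue = list(timers)
    heapq.heapify(queue)               # earliest timer first
    while queue:
        fire_at, repeat = heapq.heappop(queue)
        # Stop once the next timer is outside the grace window or past the cutoff.
        if fire_at > snapshot_at + grace_ms or fire_at > deadline:
            break
        snapshot_at = max(snapshot_at, fire_at)
        if repeat > 0:
            # A repeating timer (e.g. a relative-date updater) reschedules itself.
            heapq.heappush(queue, (fire_at + repeat, repeat))
    return snapshot_at
```

A one-shot timer that fires shortly after onload gets to run before the snapshot, a slow one is ignored, and a timer that keeps rescheduling itself only pushes the snapshot back until the cutoff.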
|
If they're smart, they actually make the exact timeout a function of an HMAC of the loaded source, which makes it very difficult to experiment, find the exact limits, and fool the indexing system. Back in 2010, it was still a fixed time limit.
Source: executing JavaScript in Google's indexing pipeline was my job from 2006 to 2010.
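
A minimal sketch of what an HMAC-derived timeout could look like, assuming a server-side secret key and a base limit with some keyed jitter on top. The name `timeout_for_source` and the 400 ms / 200 ms numbers are hypothetical; this shows the idea only, not how any real indexing pipeline does it.

```python
import hashlib
import hmac

def timeout_for_source(source: bytes, secret_key: bytes,
                       base_ms: int = 400, jitter_ms: int = 200) -> int:
    """Derive a per-page timeout in [base_ms, base_ms + jitter_ms).

    The same source always gets the same timeout, but without secret_key
    an outsider cannot predict it. base_ms and jitter_ms are made-up values.
    """
    digest = hmac.new(secret_key, source, hashlib.sha256).digest()
    # Use the first 4 bytes of the MAC as a keyed pseudo-random offset.
    offset = int.from_bytes(digest[:4], "big") % jitter_ms
    return base_ms + offset
```

Because the offset depends on the page source itself, editing the page to probe the limit also moves the limit, which is what makes the experiment hard to run.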